REMARKS/ARGUMENTS 

Prior to the present amendment, Claims 58-70 were pending in this application. With this 
amendment, Claims 64-68 have been canceled without prejudice and Claims 58-63 have been 
amended. Claims 58-62 and 69-70 are pending after entry of the instant amendment. The 
specification has been amended to correct formal errors as discussed below. The amendments to 
the specification and claims are fully supported by the specification and claims as originally filed
and do not constitute new matter. Applicants expressly reserve the right to pursue any canceled 
matter in subsequent continuation, divisional or continuation-in-part applications. 

I. Specification 

The disclosure was objected to because, according to the rejection, the correct priority 
date has not been submitted, while the specification of related application (U.S. Application
Serial No. 09/999,829) has been amended to recite the correct priority information. Applicants
would like to draw the Examiner's attention to the Preliminary Amendment submitted on
August 21, 2002, presenting the correct priority information for the present application. The 
priority information recited in the Preliminary Amendment is identical to that of U.S. 
Application Serial No. 09/999,829. Thus, the correct priority information has been submitted 
and this objection should be withdrawn. 

The disclosure was also objected to for containing errors in regard to SEQ ID NOs: 505 
and 506. In particular, the Examiner stated that "on page 101, lines 16 and 17 and 31-32, the
PRO213-1 of SEQ ID NO: 506 is stated as having 295 amino acids, but in the sequence listing,
SEQ ID NO: 506 has 273 amino acids." The Examiner further stated that "on page 309,
lines 10-12, the start and stop codons recited for SEQ ID NO: 505 is incorrect." Applicants have
amended the specification to recite that "the PRO213-1 of SEQ ID NO: 506 has 273 amino
acids" and have changed the start codon of SEQ ID NO: 505 from "336-338" to "398-401" and
the stop codon of SEQ ID NO: 505 from "1221-1223" to "1220-1222". The amendments are
supported by the specification as originally filed, and do not constitute new matter. Support for
the amendments can be found in Figure 212 and Figure 213.

As requested by the PTO, Applicants have amended the specification to correct the 
ATCC address on page 372, line 34. Further, the paragraph beginning at page 374, line 32, has
been amended to comply with the provisions of the Budapest Treaty. In addition, the
specification has been amended to remove all embedded hyperlinks and/or other forms of 
browser-executable codes. 

II. Double Patenting 

The Examiner alleges that "there are a series of applications in which SEQ ID NO: 506 is
present but do not claim the polypeptide." The Examiner has requested that Applicants point out 
to the Examiner all double patenting issues. 

To our best knowledge, Applicants have not filed any applications having claims directed 
to a polypeptide of a sequence identical to SEQ ID NO: 506. Applicants believe that the
Examiner reached his conclusion of the existence of possible conflicting claims based on the
disclosure of the publications of other U.S. applications filed by Applicants, which do not
reflect the changes made in preliminary amendments in those applications. 

III. Priority 

The PTO asserts that Applicants are entitled only to the priority of the filing date of the present
application, October 15, 2001, since the current application is not enabling for the nucleic
acid of SEQ ID NO: 505. In particular, the PTO alleges that the exact same sequence has been
given two different names (PRO213-1 and PRO1330) and is duplicated in the sequence listing.
The PTO further alleges that confusing information exists regarding the gene amplification data
for the various molecules. For example, the value given for PRO213 in Table 3 of the
provisional application Serial No. 60/131,445 is identical to that given for PRO213-1 of the
present application; therefore it appears that the values of Table 9 of the present application for
PRO213-1 are actually the values for PRO213. In addition, it appears that the gene amplification
data of PRO1330 (PRO213-1) of the present application is the same as that of PRO213 of Table 3 of
60/131,445, and therefore the gene amplification data of PRO1330 (PRO213-1) is not present in
the instant application. See page 4 of the instant Office Action. 

The duplicated sequence (SEQ ID NO: 508) has been deleted in the revised sequence
listing. Therefore, the present application is now in sequence compliance. In addition,
Applicants respectfully submit that PRO213-1 was incorrectly designated as PRO213 in U.S.
Provisional Application Serial No. 60/131,445. Therefore, the ΔCt value for PRO213 in
60/131,445 is actually the ΔCt value for PRO213-1. PRO1330 is identical to PRO213-1, and
therefore, the ΔCt value for PRO1330 is the same as that of PRO213-1 in the present application
and also the same as that of PRO213 of 60/131,445. In addition, PRO213 and PRO213-1 are
actually the same molecule, both being the same polypeptide encoded by the full-length coding
sequence of DNA30943. Due to sequencing errors, the amino acid sequence of PRO213 differs
from the sequence of PRO213-1 in a few positions. PRO213-1 is the correct sequence. Since
the PRO213 and PRO213-1 polypeptides are in fact the same molecule, it is not surprising that
they display the same biological function. Since the gene amplification data for PRO213-1 is
disclosed in both the present application and U.S. Provisional
Application Serial No. 60/131,445, claims directed to the PRO213-1 polypeptides are clearly
supported by the disclosure of U.S. Provisional Application Serial No. 60/131,445.

Applicants rely on the gene amplification assay for patentable utility which was first 
disclosed in U.S. Provisional Application Serial No. 60/131,445, filed April 28, 1999, priority to 
which has been claimed in this application. Accordingly, the present application is entitled to at 
least an effective filing date of April 28, 1999. 

IV. Claim Rejections Under 35 U.S.C. 101 and 112, First Paragraph (Enablement) 

Claims 58-70 stand rejected under 35 U.S.C. §101 allegedly "because the claimed 
invention is not supported by either a substantial and specific asserted utility or a well 
established utility." (Page 4 of the instant Office Action). Claims 58-70 are further rejected 
under 35 U.S.C. §112, first paragraph allegedly because one skilled in the art would not know
how to use the claimed invention "since the claimed invention is not supported by either a
specific and substantial asserted utility or a well established utility." (Page 9 of the instant
Office Action). The Examiner specifically notes that "the nucleic acids have utility as cancer
marker. . . . However, the protein does not have any specific and substantial utility, or a
well-established utility" and therefore concludes that no asserted utility is specific for PRO213-1
protein. The Examiner also asserts that the data showing the amplification of the nucleic acids
encoding PRO213-1 is not indicative of a use of the encoded polypeptide as a diagnostic or
therapeutic agent. Further, the Examiner alleges that since the data are not corrected for
aneuploidy, and because it does not necessarily follow that an increase in gene copy number
results in increased gene expression, the data do not support the implicit assertion that PRO213-1
can be used as a cancer diagnostic. The Examiner further quotes exemplary references like
Pennica et al. and Gygi et al. to show that "it does not necessarily follow that an increase in gene
copy numbers results in increased gene expression and increased protein expression, such that 
antibodies would be useful diagnostically or as target for cancer drug development." For the 
reasons outlined below, Applicants respectfully disagree. 

Applicants submit that the cancellation of Claims 64-68 renders the rejection of these
claims moot. With respect to Claims 58-63 and 69-70, Applicants submit, as discussed below,
that not only has the PTO not established a prima facie case for lack of utility, but that the 
claimed polypeptides possess a specific and substantial asserted utility. 

Utility - Legal Standard

According to the Utility Examination Guidelines ("Utility Guidelines"), 66 Fed. 
Reg. 1092 (2001) an invention complies with the utility requirement of 35 U.S.C. §101, if it has 
at least one asserted "specific, substantial, and credible utility" or a "well-established utility."

Under the Utility Guidelines, a utility is "specific" when it is particular to the subject
matter claimed. For example, it is generally not enough to state that a nucleic acid is useful as a
diagnostic without also identifying the condition that is to be diagnosed.

The requirement of "substantial utility" defines a "real world" use, and derives from the
Supreme Court's holding in Brenner v. Manson, 383 U.S. 519, 534 (1966) stating that "[t]he 
basic quid pro quo contemplated by the Constitution and the Congress for granting a patent 
monopoly is the benefit derived by the public from an invention with substantial utility." In 
explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, that Office 
personnel must be careful not to interpret the phrase "immediate benefit to the public" or similar 
formulations used in certain court decisions to mean that products or services based on the 
claimed invention must be "currently available" to the public in order to satisfy the utility 
requirement. "Rather, any reasonable use that an applicant has identified for the invention 
that can be viewed as providing a public benefit should be accepted as sufficient, at least 
with regard to defining a 'substantial' utility." (M.P.E.P. §2107.01, emphasis added.) Indeed,
the Guidelines for Examination of Applications for Compliance With the Utility Requirement,
set forth in M.P.E.P. §2107 II (B) (1) give the following instruction to patent examiners: "If the
applicant has asserted that the claimed invention is useful for any particular practical purpose . . . 
and the assertion would be considered credible by a person of ordinary skill in the art, do not 
impose a rejection based on lack of utility." 

Finally, the Utility Guidelines restate the Patent Office's long established position that 
any asserted utility has to be "credible." "Credibility is assessed from the perspective of one of
ordinary skill in the art in view of the disclosure and any other evidence of record . . . that is
probative of the applicant's assertions." (M.P.E.P. §2107 II (B) (1) (ii)) Such a standard is
presumptively satisfied unless the logic underlying the assertion is seriously flawed, or if the
facts upon which the assertion is based are inconsistent with the logic underlying the assertion 
(Revised Interim Utility Guidelines Training Materials, 1999). 

The PTO also sets forth the evidentiary standard as to utility rejections. In general, an 
Applicant's assertion of utility creates a presumption of utility that will be sufficient to satisfy the
utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in the art to
question the objective truth of the statement of utility or its scope." In re Langer, 503 F.2d
1380, 1391, 183 USPQ 288, 297 (CCPA 1974). See also In re Jolles, 628 F.2d 1322, 206 USPQ
885 (CCPA 1980); In re Irons, 340 F.2d 974, 144 USPQ 351 (1965); In re Sichert, 566 F.2d
1154, 1159, 196 USPQ 209, 212-13 (CCPA 1977).

Compliance with 35 U.S.C. §101 is a question of fact. Raytheon v. Roper, 724 F.2d 951, 
956, 220 USPQ 592, 596 (Fed. Cir. 1983), cert. denied, 469 U.S. 835 (1984). The evidentiary
standard to be used throughout ex parte examination in setting forth a rejection is a 
preponderance of the totality of the evidence under consideration. In re Oetiker, 977 F.2d 1443, 
1445, 24 USPQ2d 1443, 1444 (Fed. Cir. 1992). Thus, to overcome the presumption of truth that 
an assertion of utility by the applicant enjoys, the Examiner must establish that it is more likely 
than not that one of ordinary skill in the art would doubt the truth of the statement of utility. 
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden
of rebuttal shift to the applicant. The issue will then be decided on the totality of evidence.

Proper Application of the Legal Standard

As discussed above under the section on "Priority," Applicants rely on the gene
amplification data for patentable utility for the claimed polypeptides. 

Gene amplification is an essential mechanism for oncogene activation. The gene 
amplification assay is well-described in Example 114 of the present application. The inventors
isolated genomic DNA from a variety of primary cancers and cancer cell lines that are listed in
Table 9, including primary lung cancers of the type and stage indicated in Table 8 (page 546).
As a negative control, DNA was isolated from the cells of ten normal healthy individuals, which
was pooled and used as a control. Gene amplification was monitored using real-time quantitative
TaqMan™ PCR. The gene amplification results are set forth in Table 9. Further, Example 114
explains that the results of TaqMan™ PCR are reported in ΔCt units, wherein one unit
corresponds to one PCR cycle or approximately a 2-fold amplification relative to control, two
units correspond to 4-fold amplification, 3 units to 8-fold amplification etc. PRO213-1 showed
approximately 1.03-5.55 ΔCt units, which corresponds to 2^1.03 - 2^5.55 fold (more than 2 fold)
amplification in human lung and colon tumors, which is significant, and thus the PRO213-1
polypeptide or a portion thereof, such as a polypeptide comprising amino acid residues 35-273,
has utility as a diagnostic marker of lung or colon cancer.
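
By way of illustration only, the conversion from ΔCt units to fold amplification described above (one unit is approximately 2-fold, two units 4-fold, three units 8-fold) can be sketched as a power of two. The function name and the 2-fold threshold constant below are hypothetical labels introduced for this sketch; they are not taken from Example 114, and the sketch assumes ideal doubling per PCR cycle.

```python
# Illustrative sketch only: assumes ideal doubling per PCR cycle, so that
# fold amplification relative to control is approximately 2 ** delta_ct.
# The names below are hypothetical and do not appear in the specification.

TWO_FOLD_THRESHOLD = 2.0  # the "at least 2-fold" level discussed in the text


def fold_amplification(delta_ct: float) -> float:
    """Convert a delta-Ct value (in PCR cycles) to approximate fold amplification."""
    return 2.0 ** delta_ct


def exceeds_two_fold(delta_ct: float) -> bool:
    """Return True when the delta-Ct value corresponds to more than 2-fold amplification."""
    return fold_amplification(delta_ct) > TWO_FOLD_THRESHOLD


if __name__ == "__main__":
    # Range reported for PRO213-1: approximately 1.03 to 5.55 delta-Ct units.
    for delta_ct in (1.0, 2.0, 3.0, 1.03, 5.55):
        print(f"delta-Ct {delta_ct:>5}: about {fold_amplification(delta_ct):.2f}-fold")
```

Under that assumption, the lower end of the reported range (1.03 units) lies just above 2-fold and the upper end (5.55 units) lies well above it, consistent with the "more than 2 fold" characterization in the preceding paragraph.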

It is well known that gene amplification occurs in most solid tumors, and generally is 
associated with poor prognosis. 

In support, Applicants submit a Declaration by Dr. Audrey Goddard with this response
and particularly draw the Examiner's attention to page 3 of the declaration which clearly states 
that: 

It is further my considered scientific opinion that an at least 2-fold increase in
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor)
sample is significant and useful in that the detected increase in gene
copy number in the tumor sample relative to the normal sample serves as a basis
for using relative gene copy number as quantitated by the TaqMan PCR
technique as a diagnostic marker for the presence or absence of tumor in a tissue
sample of unknown pathology. Accordingly, a gene identified as being
amplified at least 2-fold by the quantitative TaqMan PCR assay in a tumor
sample relative to a normal sample is useful as a marker for the diagnosis of
cancer, for monitoring cancer development and/or for measuring the efficacy of
cancer therapy. (Emphasis added). 

The attached Declaration by Audrey Goddard clearly establishes that the TaqMan real-time
PCR method described in Example 114 has gained wide recognition for its versatility,
sensitivity and accuracy, and is in extensive use for the study of gene amplification. The facts
disclosed in the Declaration also confirm that based upon the gene amplification results, one of
ordinary skill would find it credible that PRO213-1 or a portion thereof is a diagnostic marker of
human lung or colon cancer. 

Secondly, regarding the Examiner's point that "none of [the] asserted utilities is 
specific for the disclosed PRO213-1 proteins or antibodies," Applicants submit, as
discussed below, that the Examiner has not established a prima facie case for lack of 
utility for the claimed polypeptides. 

A prima facie case of lack of utility has not been established 

The Examiner bases the assertion, that increases in gene copy number do not reliably 
correlate with increased gene expression or polypeptide expression, on exemplary literature 
reports like Pennica et al. and Gygi et al. and hence concludes that the PRO213-1 polypeptides
and their antibodies lack utility. 

According to the Examiner, Pennica et al. teaches that "An analysis of WISP-1 gene
amplification and expression in human colon tumors showed a correlation between DNA
amplification and over-expression, .... In contrast, WISP-2 DNA was amplified in colon
tumors, but its mRNA expression was significantly reduced in the majority of tumors compared
with expression in normal colonic mucosa from the same patient." (Emphasis added). Firstly,
Applicants draw attention to Pennica's showing that "a correlation between DNA amplification
and over-expression exists for the WISP-1 gene" in 84% of the tumors examined. While Pennica
discloses a lack of correlation for the WISP-2 gene, Pennica teaches nothing regarding such a
lack of correlation in genes in general. That is, Pennica's teachings are specific for the WISP
family of genes, and are not directed to genes in general. The Utility Guidelines require that for
a prima facie showing of lack of utility, the Examiner has to provide evidence that it is more
likely than not that a lack of correlation between protein expression and gene amplification
exists, in general. Accordingly, Applicants respectfully submit that Pennica teaches nothing of
the correlation between gene amplification and polypeptide over-expression in general.

Further, the Examiner cites the Gygi et al. reference to establish that "even if gene 
amplification correlates with increased transcription, it does not always follow that protein levels 
are also amplified." The Examiner adds that "Gygi et al. studied 150 proteins. . . and found no
strong correlation between proteins and transcript levels." Applicants respectfully traverse and
point out that, on the contrary, Gygi et al. never indicate that the correlation between mRNA and
protein levels does not exist. Gygi et al. only state that the correlation may not be sufficient in
accurately predicting protein level from the level of the corresponding mRNA transcript
(Emphasis added) (see page 1270, Abstract). Contrary to the Examiner's statement, the Gygi 
data indicate a general trend of correlation between protein [expression] and transcript levels 
(Emphasis added). For example, as shown in Figure 5, the mRNA abundance of 250-300
copies/cell correlates with the protein abundance of 500-1000 x 10^3 copies/cell. The mRNA abundance
of 100-200 copies/cell correlates with the protein abundance of 250-500 x 10^3 copies/cell
(emphasis added). Therefore, high levels of mRNA generally correlate with high levels of 
proteins. In fact, most data points in Figure 5 did not deviate or scatter away from the general
trend of correlation. Thus, the Gygi data meet the "more likely than not" standard and show
that a positive correlation exists between mRNA and protein. Therefore, Applicants submit that
the Examiner's rejection is based on a misrepresentation of the scientific data presented in Gygi
et al.

In conclusion, the Examiner has not shown that a lack of correlation between gene 
amplification and polypeptide over-expression, as observed for the WISP-2 or the abl genes, is
typical. In fact, contrary to what the Examiner contends, the art indicates that, if a gene is
amplified in cancer, it is more likely than not that the encoded protein will be expressed at an
elevated level. As noted even in Pennica et al., a correlation between DNA amplification and
polypeptide over-expression was observed in the case of WISP-1 and similarly, in Gygi et al.,
most genes showed a correlation between increased mRNA and translated protein. Since the
standard is not absolute certainty, a prima facie showing of lack of utility has not been made in
this instance.

It is "more likely than not" for amplified genes to have increased mRNA and protein
levels 

Applicants submit further exemplary articles to show that, contrary to what the Examiner 
asserts, the art indicates that, generally, if a gene is amplified in cancer, it is more likely than 
not that the encoded protein will be expressed at an elevated level. For example, Orntoft et al.
(Mol. and Cell. Proteomics, 2002, Vol. 1, pages 37-45) studied transcript levels of 5600 genes in
malignant bladder cancers many of which were linked to the gain or loss of chromosomal
material using an array-based method. Orntoft et al. showed that there was a gene dosage effect
and taught that "in general (18 of 23 cases) chromosomal areas with more than 2-fold gain of
DNA showed a corresponding increase in mRNA transcripts" (see column 1, abstract). In
addition, Hyman et al. (Cancer Res., 2002, Vol. 62, pages 6240-45) showed, using CGH analysis
and cDNA microarrays which compared DNA copy numbers and mRNA expression of over
12,000 genes in breast cancer tumors and cell lines, that there was "evidence of a prominent
global influence of copy number changes on gene expression levels." (see page 6244, column 1,
last paragraph). Additional supportive teachings were also provided by Pollack et al. (PNAS,
2002, Vol. 99, pages 12963-12968) who studied a series of primary human breast tumors and
showed that ". . . 62% of highly amplified genes show moderately or highly elevated expression,
and DNA copy number influences gene expression across a wide range of DNA copy number
alterations (deletion, low-, mid- and high-level amplification), and that on average, a 2-fold
change in DNA copy number is associated with a corresponding 1.5-fold change in mRNA
levels." Thus, these articles collectively teach that in general, gene amplification increases 
mRNA expression. 

In addition, enclosed is a Declaration by Dr. Polakis, principal investigator of the Tumor
Antigen Project of Genentech, Inc., the assignee of the present application, to show that mRNA
expression correlates well with protein levels, in general. As Dr. Polakis explains, the primary
focus of the microarray project was to identify tumor cell markers useful as targets for both the
diagnosis and treatment of cancer in humans. The scientists working on the project extensively
rely on results of microarray experiments in their effort to identify such markers. As Dr. Polakis
explains, using microarray analysis, Genentech scientists have identified approximately 200 gene
transcripts (mRNAs) that are present in human tumor cells at significantly higher levels than in
corresponding normal human cells. To date, they have generated antibodies that bind to about
30 of the tumor antigen proteins expressed from these differentially expressed gene transcripts 
and have used these antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. Having compared 
the levels of mRNA and protein in both the tumor and normal cells analyzed, they found a very 
good correlation between mRNA and corresponding protein levels. Specifically, in 
approximately 80% of their observations they have found that increases in the level of a 
particular mRNA correlate with changes in the level of protein expressed from that mRNA.
While the proper legal standard is to show that the existence of correlation between mRNA and
polypeptide levels is more likely than not, the showing of approximately 80% correlation for the
molecules tested in the Polakis Declaration greatly exceeds this legal standard. Based on these
experimental data and his vast scientific experience of more than 20 years, Dr. Polakis states 
that, for human genes, increased mRNA levels typically correlate with an increase in abundance 
of the encoded protein. He further confirms that "it remains a central dogma in molecular 
biology that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein." 

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology, that there is a correlation between polypeptide 
and mRNA levels, these instances are exceptions rather than the rule. In the vast majority of 
amplified genes, the teachings in the art, as exemplified by Orntoft et al., Hyman et al., Pollack
et al., and the Polakis declaration, overwhelmingly show that gene amplification influences gene
expression at the mRNA and protein levels. Thus, one of skill in the art would reasonably expect
in this instance, based on the amplification data for the PRO213-1 gene, that the PRO213-1
protein or a portion thereof, is concomitantly overexpressed. Thus, Applicants submit that the
claimed polypeptides including the PRO213-1 proteins have utility in the diagnosis of cancer and
based on such a utility, one of skill in the art would know exactly how to use the claimed 
polypeptides for diagnosis of cancer.

Even if a prima facie case of lack of utility has been established, it should be
withdrawn on consideration of the totality of evidence 

Assuming arguendo that it is more likely than not that there is no correlation between
gene amplification and increased mRNA/protein expression, which Applicants submit is not true,
a polypeptide encoded by a gene that is amplified in cancer would still have a credible, specific
and substantial utility. In support, Applicants submit a Declaration by Avi Ashkenazi, Ph.D., an
expert in the field of cancer biology and an inventor of the instant application. Dr. Avi
Ashkenazi's Declaration explains that:

even when amplification of a cancer marker gene does not result in 
significant over-expression of the corresponding gene product, this very 
absence of gene product over-expression still provides significant 
information for cancer diagnosis and treatment. Thus, if over-expression 
of the gene product does not parallel gene amplification in certain tumor 
types but does so in others, then parallel monitoring of gene amplification
and gene product over-expression enables more accurate tumor
classification and hence better determination of suitable therapy. In
addition, absence of over-expression is crucial information for the 
practicing clinician. If a gene is amplified but the corresponding gene 
product is not over-expressed, the clinician accordingly will decide not to 
treat a patient with agents that target that gene product. 

Applicants thus submit that simultaneous testing of gene amplification and gene product 
over-expression enables more accurate tumor classification, even if the gene-product, the protein, 
is not over-expressed. This leads to better determination of a suitable therapy. Further, as 
explained in Dr. Ashkenazi's Declaration, absence of over-expression of the protein itself is 
crucial information for the practicing clinician. If a gene is amplified in a tumor, but the 
corresponding gene product is not over-expressed, the clinician will decide not to treat a patient 
with agents that target that gene product. This not only saves money, but also the patient need 
not be exposed to the side effects associated with such agents. 

This is further supported by the teachings of the attached article by Hanna and Momin.
The article teaches that the HER-2/neu gene has been shown to be amplified and/or
over-expressed in 10%-30% of invasive breast cancers and in 40%-60% of intraductal breast
carcinoma. Further, the article teaches that diagnosis of breast cancer includes testing both the
amplification of the HER-2/neu gene (by FISH) as well as the over-expression of the HER-2/neu
gene product (by IHC). Even when the protein is not over-expressed, the assay relying on both
tests leads to a more accurate classification of the cancer and a more effective treatment of it. 

Thus, Applicants have demonstrated a credible, specific and substantial asserted utility 
for the claimed polypeptides, for example, in detecting over-expression or absence of expression 
of the claimed polypeptides. Further, based on this utility and the disclosure in the specification, 
one skilled in the art at the time the application was filed would know how to use the claimed 
polypeptides. 

Claims 58-62, and 69-70 stand rejected under 35 U.S.C. §112, first paragraph, because 
"the specification does not enable any person skilled in the art to which it pertains, or with which 
it is most nearly connected, to make or use the invention commensurate in scope with these 
claims." Specifically, the PTO alleges that "even if there were a patentable use of the protein of 
SEQ ID 506, variants of 80-99% identity would not be enabled because the specification has not 
taught one of ordinary skill in the art how to use them or fragments thereof." See page 9 of the
instant Office Action. 

Without acquiescing to the rejection, Applicants submit that the cancellation of 
Claims 64-68 renders the rejection of these claims moot. Without acquiescing to the Examiner's 
position in the current rejections, and without prejudice to further prosecution of the
subject-matter in one or more continuation or divisional applications, Claims 58-62 (and, as a
consequence, those claims dependent from the same) have been amended to recite "wherein the
nucleic acid encoding the polypeptide is amplified in colon or lung tumors." Since the claimed
genus is now characterized by a combination of structural and functional features, any person of
skill would know how to make and use the invention without undue experimentation based on
the general knowledge in the art at the time the invention was made. As the M.P.E.P. states,
"The fact that experimentation may be complex does not necessarily make it undue, if the art
typically engages in such experimentation." In re Certain Limited-Charge Cell Culture
Microcarriers, 221 USPQ 1165, 1174 (Int'l Trade Comm'n 1983), aff'd sub nom. Massachusetts
Institute of Technology v. A.B. Fortia, 774 F.2d 1104, 227 USPQ 428 (Fed. Cir. 1985);
M.P.E.P. §2164.01.

Accordingly, Applicants submit that pending Claims 58-62 and 69-70 are enabled, as
required by 35 U.S.C. §112, first paragraph. The PTO is respectfully requested to reconsider and 
withdraw the rejection of Claims 58-62, and 69-70 under 35 U.S.C. §112, first paragraph. 

In view of the foregoing discussion, Applicants request that the present 35 U.S.C. §101 
and §112, first paragraph rejections to the pending claims be withdrawn. 

V. Claim Rejections Under 35 U.S.C. §112, First Paragraph (Enablement) (ATCC
Deposit) 

The PTO further alleges that Claims 58-63 and 68-70 stand rejected under 35 U.S.C. 
§112, first paragraph, because Applicants were not fully compliant with the Budapest Treaty.
Specifically, the PTO states that Applicants must state that a viable culture of the deposit would
be maintained for 30 years from the date of deposit and for at least five (5) years after the most
recent request for the furnishing of a sample of the deposit received by the depository.

Without acquiescing to the rejection, Applicants submit that the cancellation of
Claims 64-68 renders the rejection of these claims moot. Further, the sentence beginning on 
page 378, line 35 has been amended to state, "This assures maintenance of a viable culture of the 
deposit for 30 years from the date of deposit and for at least five (5) years after the most recent 
request for the furnishing of a sample of the deposit received by the depository." 

Accordingly, Applicants submit that all the requirements of 37 C.F.R. §1.806 are met and 
that Applicants are fully compliant with the requirements of the Budapest Treaty. Applicants 
therefore request the PTO to reconsider and withdraw the rejection of the pending claims under 
35 U.S.C. §112, first paragraph. 

VI. Claim Rejections Under 35 U.S.C. §112, Second Paragraph 

Claims 58-70 are rejected under 35 U.S.C. §112, second paragraph, as being indefinite 
for failing to particularly point out and distinctly claim the subject matter which applicant 
regards as the invention. The Examiner alleges that the claimed polypeptides are not identified 
as transmembrane proteins, and therefore the term "extracellular" is indefinite. 

Without acquiescing to the propriety of this rejection and solely in the interest of 
expedited prosecution in this case, the term "extracellular domain" has been deleted, thus rendering 
the rejection moot. 

VII. Claim Rejections Under 35 U.S.C. §112, First Paragraph (Written Description) 

Claims 58-62, 69, and 70 stand rejected under 35 U.S.C. §112, first paragraph, for 
allegedly containing subject matter that was not described in the specification in such a way as to 
reasonably convey to one skilled in the relevant art that the inventors had possession of the 
claimed invention at the time the application was filed. In particular, the PTO notes that "the 
claims are drawn to polypeptides having at least 80%, 85%, 90%, 95% or 99% sequence identity 
with a particular disclosed sequence. . . . The claims do not require that the claimed 
polypeptides possess any particular biological activity . . . ." 

Without acquiescing to the Examiner's position in the current rejections, and without 
prejudice to further prosecution of the subject-matter in one or more continuation or divisional 
applications, Claims 58-62 (and, as a consequence, those claims dependent from the same) have 
been amended to recite "wherein the nucleic acid encoding the polypeptide is amplified in colon 
or lung tumors." 

Thus, this biological activity, coupled with a well-defined and relatively high degree of 
sequence identity, is believed to sufficiently define the claimed genus, such that one skilled in 
the art would readily recognize that the Applicants were in possession of the invention 
claimed at the effective filing date of this application. Hence, the present rejection should be 
withdrawn. 

VIII. Claim Rejections Under 35 U.S.C. § 102(e) 

Claims 58-69 stand rejected under 35 U.S.C. §102(e) as allegedly being anticipated by 
Holtzman et al. (U.S. Published Patent Application 20020028508) ("Holtzman"), with an 
effective priority date of April 23, 1998. In particular, the Examiner alleges that Holtzman et al. 
disclose a protein that is 100% identical to the protein of SEQ ID NO: 506. 

Claims 58-62, 69 and 70 stand rejected under 35 U.S.C. §102(e) as allegedly being 
anticipated by Sheppard et al. (U.S. Published Patent Application 20030166907) ("Sheppard"), 
with an effective priority date of June 18, 1997. In particular, the Examiner alleges that 
Sheppard et al. disclose a protein that is 99% identical to the protein of SEQ ID NO: 506. 

Without acquiescing to the rejection, Applicants respectfully submit that the cancellation 
of Claims 64-68 renders the rejection of these claims moot. Claims 58-62 were amended to 
recite "the polypeptide of residues 35-273 of SEQ ID NO: 506". In addition, Applicants 
respectfully submit Declarations under 37 C.F.R. §1.131 by Dr. Goddard, Dr. Godowski, 
Dr. Gurney, and Dr. Wood, that establish that Applicants had conceived and reduced to practice 
the invention corresponding to the disclosure of the cited references before June 18, 1997, the 
effective priority date of Holtzman et al. and Sheppard et al. The consideration of the 
Declarations is respectfully requested. Applicants are in the process of obtaining signatures of all 
the inventors of the present application. The Declaration with the signatures of all the inventors 
will be submitted to the PTO shortly. 

Applicants need to disclose only what is disclosed in the cited reference to support 
their priority claim 

Applicants respectfully submit that in order to overcome the 35 U.S.C. §102(e) rejection 
over Holtzman et al. and Sheppard et al. and support the priority claim, the Declarations by 
Dr. Goddard, Dr. Godowski, Dr. Gurney, and Dr. Wood ("Declarations") simply need to provide 
a disclosure commensurate in scope with the disclosure in both Holtzman et al. and Sheppard 
et al. 

In order to remove a reference as prior art, "[i]t is sufficient if [the affidavit under 
Patent Office Rule 131] shows that as much of the claimed invention as is taught in the reference 
has been reduced to practice by the [patentee] prior to the date of the reference." In re Stempel, 
241 F.2d 755, 757 (1957). In In re Stempel, the patent applicant (Stempel) had claims directed 
to both (i) a particular genus of chemical compounds (the "generic" claim) and (ii) a single 
species of chemical compound that was encompassed within that genus (the "species" claim). In 
support of a rejection under 35 U.S.C. §102, the examiner cited against the application a prior art 
reference that disclosed the exact chemical compound recited in the "species" claim. In response 
to the rejection, the patent applicant filed a declaration under 37 C.F.R. §1.131 demonstrating 
that he had made that specific chemical compound prior to the effective date of the cited prior art 
reference. The Court found the applicant's 37 C.F.R. § 1.131 declaration effective for swearing 
behind the cited reference for purposes of both the "species" claim and the "genus" claim. 
Specifically, the Court stated in support of its decision that "all the applicant can be required to 
show is priority with respect to so much of the claimed invention as the reference happens to 
show. When he has done that he has disposed of the reference." Id. at 759. 

Amendment and Response to Office Action 
(Dated: June 2, 2004, Paper No./Mail Date 05172004) 
Application Serial No. 09/978,191 
Attorney's Docket No. 39780-2630 P1C4 

Furthermore, the Examiner is respectfully directed to In re Moore, 170 USPQ 260 
(CCPA 1971), where the holding in In re Stempel was affirmed. In In re Moore, the patent 
applicant claimed a particular chemical compound in his patent application, and the examiner, 
in a rejection under 35 U.S.C. §102, cited against the application a prior art reference which disclosed 
the compound but did not disclose any specific utility for the compound. The patent applicant 
filed a declaration under 37 C.F.R. §1.131 demonstrating that he had made the claimed 
compound before the effective date of the cited prior art reference, even though he had not yet 
established a utility for that compound. On appeal, the Court indicated that the 131 declaration 
filed by the patent applicant was sufficient to remove the cited reference. The Court relied on the 
established "Stempel Doctrine" to support its decision, stating: 

An applicant need not be required to show [in a declaration under 37 C.F.R. 
§1.131] any more acts with regard to the subject matter claimed that can be 
carried out by one of ordinary skill in the pertinent art following the description 
contained in the reference ... the determination of a practical utility when one is 
not obvious need not have been accomplished prior to the date of a reference 
unless the reference also teaches how to use the compound it describes. 

In re Moore, 170 USPQ at 267 (emphasis added). 

Thus, In re Moore confirmed the holding in In re Stempel which states that in order to 
effectively remove a cited reference with a declaration under 37 C.F.R. §1.131, an applicant need 
only show that portion of his or her claimed invention that appears in the cited reference. 

Accordingly, Applicants respectfully submit that the Declarations simply need to show 
possession of the polypeptide sequence and its encoding polynucleotide sequence and the 
homology analysis of the polypeptide as disclosed in Holtzman et al. and Sheppard et al. in order 
to overcome the 35 U.S.C. §102 rejection over these two references. 

As shown in the Declarations, Applicants respectfully submit that Dr. Goddard, 
Dr. Godowski, Dr. Gurney, and Dr. Wood conceived and reduced to practice the PRO213 
polypeptide, which comprises amino acid residues 35-273 of SEQ ID NO:506 as claimed in the 
present application, and its encoding nucleic acid sequence, in the United States prior to 
June 18, 1997. The polypeptide encoded by the claimed nucleic acid sequence was also shown 
to have homology to human gas6 protein before the priority date of both prior art patent 
applications. 

As indicated in the Declarations and the brief description of Figure 1 of the present 
specification, the PRO213 polypeptide is encoded by DNA30943-1163. 

Furthermore, as stated in the Declarations, the GSeqEdit database stores cloning, 
sequencing and functional information for each PRO polypeptide and its encoding nucleic acid 
sequences according to its DNA number. Copies of the pages from the GSeqEdit database report 
(with the dates redacted) showing the cloning and sequencing information for the PRO213 
polypeptide sequence and its encoding nucleic acid sequence are attached to the Declarations as 
Exhibit A. 

The GSeqEdit report shows the full length nucleic acid sequence for DNA30943-1163 
(identified as "DNA30943") and the full length polypeptide sequence encoded by 
DNA30943-1163. As evidenced from the report and stated in the Declarations, both the nucleic acid and 
amino acid sequences shown in Exhibit A were obtained prior to June 18, 1997. 

In addition, as stated in the Declarations, the DNA sequence of nucleotides 498 to 1216 
of the DNA30943 sequence shown in the GSeqEdit report is identical to that of nucleotides 
500-1222 of SEQ ID NO:505 disclosed in the present application. Further, the sequence of 
amino acid residues 54 to 295 of the PRO213 polypeptide shown in the GSeqEdit report is identical 
to that of amino acids 35-273 of SEQ ID NO: 506 disclosed in the present application. In 
addition, the report indicates that the polypeptide is homologous to human gas6 protein. 
Accordingly, the Declarations along with attached Exhibit A clearly show that Applicants were 
in possession of DNA30943, the polypeptide encoded by DNA30943, and the homology 
information prior to June 18, 1997. Therefore, the Declarations clearly establish that 
the claimed polypeptides, the nucleic acids encoding them, and their homology to gas6 were 
conceived and reduced to practice prior to June 18, 1997. 

Consequently, based on the holdings of In re Stempel and In re Moore, Applicants 
respectfully submit that Holtzman et al. and Sheppard et al. are not prior art under 35 U.S.C. 
§102(e), since their priority dates are after the date the instant invention was conceived and 
reduced to practice in the United States. 

Accordingly, the Examiner is respectfully requested to reconsider and withdraw the 
rejection of Claims 58-69 under 35 U.S.C. §102(e). 

IX. Claim Rejections Under 35 U.S.C. §103(a) 

Claim 70 stands rejected under 35 U.S.C. §103(a) as being unpatentable over Holtzman 
et al. in view of Hopp et al. For the reasons outlined below, Applicants respectfully disagree 
with this rejection. 

To reject claims in an application under 35 U.S.C. §103, the PTO bears the initial burden 
of establishing a prima facie case of obviousness. In re Bell, 26 USPQ2d 1529, 1530 (Fed. 
Cir. 1993); MPEP § 2142. In order to establish prima facie obviousness, three basic criteria 
must be met. 

First, the prior art must provide one of ordinary skill in the art with a suggestion or 
motivation to modify or combine the teachings of the references relied upon by the PTO to arrive 
at the claimed invention. Second, the prior art must provide one of ordinary skill in the art with a 
reasonable expectation of success that the modification or combination suggested by the PTO 
would succeed. In re Dow, 5 USPQ2d 1529, 1531-32 (Fed. Cir. 1988). Third, the prior art, 
either alone or in combination, must teach or suggest each and every limitation of the rejected 
claims. In re Gartside, 53 USPQ2d 1769 (Fed. Cir. 2000) (Emphasis added). If any one of these 
criteria is not met, prima facie obviousness is not established, and Applicants are not required 
to show new or unanticipated results. In re Grabiak, 226 USPQ 870 (Fed. Cir. 1985). 

Applicants submit that the references cited by the PTO are not sufficient to establish a 
prima facie case of obviousness against Claim 70. 

As discussed in Section VIII, above, Holtzman et al. is not prior art because, as 
evidenced by the Declarations of Dr. Goddard, Dr. Godowski, Dr. Gurney, and Dr. Wood, 
submitted herewith, the inventors had conceived and reduced the instant invention to practice in 
the United States before the effective priority date of this reference. 

As Holtzman et al. is not prior art, the PTO has failed to establish prima facie 
obviousness against Claim 70. Accordingly, Applicants request that the rejection of Claim 70 
under 35 U.S.C. §103(a) as being unpatentable over Holtzman et al. in view of Hopp et 
al. be withdrawn. 

CONCLUSION 



In conclusion, the present application is believed to be in prima facie condition for 
allowance, and an early action to that effect is respectfully solicited. Should there be any further 
issues outstanding, the Examiner is invited to contact the undersigned attorney at the telephone 
number shown below. 

Please charge any additional fees, including fees for additional extension of time, or 
credit overpayment to Deposit Account No. 08-1641 (referencing Attorney's Docket 
No. 39780-2630 P1C4). 



Respectfully submitted, 

Date: October 4, 2004 

Ginger Dreger (Reg. No. 33,055) 

HELLER EHRMAN WHITE & McAULIFFE LLP 
275 Middlefield Road 
Menlo Park, California 94025 
Telephone: (650) 324-7000 
Facsimile: (650) 324-0638 



Each year, over 182,000 women in the United States are 
diagnosed with breast cancer, and approximately 45,000 die 
of the disease.1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role.2 

Five-year survival rates range from approximately 65% to 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predictive 
for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adjuvant 
therapy, which increases their survival. However, 20%-30% 
of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to identify 
this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recurrence 
prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expression 
of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in 
40%-60% of intraductal breast carcinoma. 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridization, 
PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantification 
of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification. 
At least one study has demonstrated a difference in 
recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. HER-2/neu 
status may be particularly important to establish in women with 
small (<1 cm) tumor size. 

The choice of methodology for determination of HER-2/neu 
status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of Adriamycin-based 
therapy, while those with normal HER-2/neu levels did 
not. The study therefore identified a subset of women who, 
because they did not benefit from more aggressive treatment, 
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplification 
in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time, and disease-related death. Demonstration of HER-2/neu 
gene amplification by FISH has also been shown to be 
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab) monoclonal 
antibody therapy, however, is based upon demonstration 
of HER-2/neu protein overexpression using HercepTest™. 
Studies using Herceptin® in patients with metastatic breast 
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification 
status determined by FISH are in progress. 

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results, 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clinical 
significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize 
immunohistochemistry (HercepTest™) as a screen, followed 
by FISH in IHC-negative cases. Alternatively, either 
method may be ordered individually depending on the clinical 
setting or clinician preference. 

HER-2/neu via IHC 

88342     (including interpretive report) 

HER-2/neu via FISH 

88271 x2  Molecular cytogenetics, DNA probe, each 
88274     Molecular cytogenetics, interphase in situ hybridization, analyze 25-99 cells 
88291     Cytogenetics and molecular cytogenetics, interpretation and report 

Procedural Information 

Immunohistochemistry is performed using the FDA-approved 
DAKO antibody kit, HercepTest™. The DAKO kit contains 
reagents required to complete a two-step immunohistochemical 
staining procedure for routinely processed, paraffin-embedded 
specimens. Following incubation with the primary 
rabbit antibody to human HER-2/neu protein, the kit employs 
a ready-to-use dextran-based visualization reagent. This reagent 
consists of both secondary goat anti-rabbit antibody 
molecules and horseradish peroxidase molecules linked to a 
common dextran polymer backbone, thus eliminating the need 
for sequential application of link antibody and peroxidase-conjugated 
antibody. Enzymatic conversion of the subsequently 
added chromogen results in formation of visible 
reaction product at the antigen site. The specimen is then counterstained; 
a pathologist using light microscopy interprets the 
results. 

FISH analysis at SHMC/PAML is performed using the 
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced 
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast 
tissue is processed using routine histological methods, and then 
slides are treated to allow hybridization of DNA probes to the 
nuclei present in the tissue section. The PathVysion™ kit contains 
two direct-labeled DNA probes, one specific for the 
alphoid repetitive DNA (CEP 17, spectrum orange) present at 
the chromosome 17 centromere and the second for the HER-2/neu 
oncogene located at 17q11.2-12 (spectrum green). Enumeration 
of the probes allows a ratio of the number of copies 
of chromosome 17 to the number of copies of HER-2/neu to 
be obtained; this enables quantification of low versus high 
amplification levels, and allows an estimate of the percentage 
of cells with HER-2/neu gene amplification. The clinically 
relevant distinction is whether the gene amplification is due 
to increased gene copy number on the two chromosome 17 
homologues normally present or an increase in the number of 
chromosome 17s in the cells. In the majority of cases, ratio 
equivalents less than 2.0 are indicative of a normal/negative 
result; ratios of 2.1 and over indicate that amplification is 
present and to what degree. Interpretation of these data will be 
performed and reported from the Vysis-certified Cytogenetics 
laboratory at SHMC. 
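
As an illustration only, the ratio-based interpretation described above can be sketched in a few lines of Python. The sketch assumes that the ratio is computed as HER-2/neu signals divided by CEP 17 signals (the orientation under which values above 2 indicate amplification). The class and function names are hypothetical, and the "borderline" label for ratios from 2.0 up to 2.1 is an assumption, since the text does not say how such values are reported.

```python
from dataclasses import dataclass


@dataclass
class FishCounts:
    """Signal counts from one specimen (hypothetical container)."""
    her2_signals: int   # total HER-2/neu probe signals counted
    cep17_signals: int  # total chromosome 17 centromere signals counted
    nuclei: int         # number of nuclei scored


def classify_ratio(ratio: float) -> str:
    """Apply the cutoffs quoted in the text: <2.0 negative, >=2.1 amplified."""
    if ratio < 2.0:
        return "not amplified"
    if ratio >= 2.1:
        return "amplified"
    # Assumption: the text gives no rule for values between 2.0 and 2.1.
    return "borderline"


def summarize(counts: FishCounts) -> dict:
    """Return the ratio, per-nucleus averages, and the resulting call."""
    if counts.nuclei <= 0 or counts.cep17_signals <= 0:
        raise ValueError("need at least one nucleus and one CEP 17 signal")
    ratio = counts.her2_signals / counts.cep17_signals
    return {
        "ratio": ratio,
        # Per-nucleus averages help separate true gene amplification from
        # an increase in the number of chromosome 17 copies.
        "her2_per_nucleus": counts.her2_signals / counts.nuclei,
        "cep17_per_nucleus": counts.cep17_signals / counts.nuclei,
        "call": classify_ratio(ratio),
    }
```

Reporting the per-nucleus averages alongside the ratio is a design choice of this sketch: a high HER-2/neu count with a normal CEP 17 count points to gene amplification, while both counts rising together points to extra copies of chromosome 17.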

ABSTRACT 

Genetic changes underlie tumor progression and may lead to cancer-specific 
expression of critical genes. Over 1100 publications have described 
the use of comparative genomic hybridization (CGH) to analyze 
the pattern of copy number alterations in cancer, but very few of the genes 
affected are known. Here, we performed high-resolution CGH analysis on 
cDNA microarrays in breast cancer and directly compared copy number 
and mRNA expression levels of 13,824 genes to quantitate the impact of 
genomic changes on gene expression. We identified and mapped the 
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12 
Mb. Throughout the genome, both high- and low-level copy number 
changes had a substantial impact on gene expression, with 44% of the 
highly amplified genes showing overexpression and 10.5% of the highly 
overexpressed genes being amplified. Statistical analysis with random 
permutation tests identified 270 genes whose expression levels across 14 
samples were systematically attributable to gene amplification. These 
included most previously described amplified genes in breast cancer and 
many novel targets for genomic alterations, including the HOXB7 gene, 
the presence of which in a novel amplicon at 17q21.3 was validated in 
10.2% of primary breast cancers and associated with poor patient prognosis. 
In conclusion, CGH on cDNA microarrays revealed hundreds of 
novel genes whose overexpression is attributable to gene amplification. 
These genes may provide insights to the clonal evolution and progression 
of breast cancer and highlight promising therapeutic targets. 



INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct categories, 
some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have remained 
elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against amplified 
oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over 
20 recurrent regions of DNA amplification have been mapped in 
breast cancer by CGH (9, 10). However, these amplicons are often 
large and poorly defined, and their impact on gene expression remains 
unknown. 

Received 5/29/02; accepted 8/28/02. 

The costs of publication of this article were defrayed in part by the payment of page 
charges. This article must therefore be hereby marked advertisement in accordance with 
18 U.S.C. Section 1734 solely to indicate this fact. 

Supported in part by the Academy of Finland, Emil Aaltonen Foundation, the Finnish 
Cancer Society, the Pirkanmaa Cancer Society, the Pirkanmaa Cultural Foundation, the 
Finnish Breast Cancer Group, the Foundation for the Development of Laboratory Medicine, 
the Medical Research Fund of the Tampere University Hospital, the Foundation for 
Commercial and Technical Sciences, and the Swedish Research Council. 

Supplementary data for this article are available at Cancer Research Online 
(http://cancerres.aacrjournals.org). 

Contributed equally to this work. 

To whom requests for reprints should be addressed, at Laboratory of Cancer Genetics, 
Institute of Medical Technology, Lenkkeilijänkatu 6, FIN-33520 Tampere, Finland. 
Phone: 358-3247-4125; Fax: 358-3247-4168; E-mail: anne.kallioniemi@uta.fi. 

Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of 
over- and underexpressed genes (Y axis) according to copy number ratios (X axis). 
Threshold values used for over- and underexpression were >2.184 (global upper 7% of 
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage 
of amplified and deleted genes according to expression ratios. Threshold values for 
amplification and deletion were >1.5 and <0.7. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively involved 
in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA expression 
is most significantly associated with amplification of the corresponding 
genomic template. 

The abbreviations used are: CGH, comparative genomic hybridization; FISH, fluorescence 
in situ hybridization; RT-PCR, reverse transcription-PCR. 

Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue 
line) across the entire genome from 1p telomere to Xq telomere is shown along with ±1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8; 
and green line, a ratio of 1.2. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position 
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the 
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots, 
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios 
(underexpressed genes); the rest of the observations are shown with black crosses. The chromosome numbers are shown at the bottom of the figure, and chromosome boundaries are 
indicated with a dashed line. 

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, BT-474, 
HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468, 
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the 
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The 
preparation and printing of the 13,824 cDNA clones on glass slides were 
performed as described (1 1-13). Of these clones, 244 represented uncharac- 
terized expressed sequence tags, and the remainder corresponded to known 
genes. CGH experiments on cDNA microarrays were done as described (14,
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technologies,
Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and
posthybridization washes (13) were done as described. For the expression
analyses, a standard reference (Universal Human Reference RNA; Stratagene,
La Jolla, CA) was used in all experiments. Forty μg of reference RNA were
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the
average intensity of the corresponding clone in the control hybridization. For
the copy number analysis, the ratios were normalized on the basis of the
distribution of ratios of all targets on the array and for the expression analysis
on the basis of 88 housekeeping genes, which were spotted four times onto the
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and
reference intensity <100 fluorescent units and/or with spot size <50 units)
were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for increased/
decreased copy number. Genes with CGH ratio >1.43 (representing the upper
5% of the CGH ratios across all experiments) were considered to be amplified,
and genes with ratio <0.73 (representing the lower 5%) were considered to be
deleted.
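
The quality filter and cutpoints described above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the normalization step is omitted, and only the copy number rule (mean reference intensity <100 units treated as missing) is modeled.

```python
# Cutpoints quoted in the text (upper and lower 5% of CGH ratios).
AMPLIFIED_CUTOFF = 1.43
DELETED_CUTOFF = 0.73
MIN_REFERENCE_INTENSITY = 100  # fluorescent units


def cgh_ratio(test_intensity, reference_intensity):
    """Return the test/reference ratio, or None for a low quality spot.

    Intensities are assumed to be background-subtracted already.
    """
    if reference_intensity < MIN_REFERENCE_INTENSITY:
        return None  # treated as a missing value
    return test_intensity / reference_intensity


def classify_copy_number(ratio):
    """Map a (normalized) CGH ratio to a copy number call."""
    if ratio is None:
        return "missing"
    if ratio > AMPLIFIED_CUTOFF:
        return "amplified"
    if ratio < DELETED_CUTOFF:
        return "deleted"
    return "unchanged"
```

For example, a clone with test intensity 300 and reference intensity 150 has a ratio of 2.0 and would be called amplified under these cutpoints.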

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

\[ w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}} \]

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random permutations
of the label vector. The probability that a gene had a larger or equal
weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
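
A minimal sketch of this permutation procedure is given below. It assumes the usual signal-to-noise form (difference of group means divided by the sum of group SDs) and the sample standard deviation; neither the function names nor the choice of SD estimator come from the source.

```python
import random
import statistics


def signal_to_noise(expression, labels):
    """Weight for one gene: (mean_amp - mean_non) / (sd_amp + sd_non)."""
    amp = [x for x, lab in zip(expression, labels) if lab == 1]
    non = [x for x, lab in zip(expression, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        raise ValueError("need at least two cell lines in each group")
    spread = statistics.stdev(amp) + statistics.stdev(non)
    if spread == 0:
        raise ValueError("zero spread in both groups")
    return (statistics.mean(amp) - statistics.mean(non)) / spread


def permutation_alpha(expression, labels, n_permutations=10000, seed=0):
    """Fraction of label permutations whose weight is >= the observed weight."""
    rng = random.Random(seed)
    observed = signal_to_noise(expression, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if signal_to_noise(expression, shuffled) >= observed:
            hits += 1
    return hits / n_permutations
```

With 14 cell lines and only a few amplified lines the number of distinct label vectors is small, so the attainable α values are coarse; that is a property of the design, not of this sketch.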

Genomic Localization of cDNA Clones and Amplicon Mapping. Each
cDNA clone on the microarray was assigned to a Unigene cluster using the
Unigene Build 141. A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the University
of California Santa Cruz's GoldenPath database. The chromosome and
bp positions for each cDNA clone were then retrieved by relating these data
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three
adjacent clones in a single cell line. The amplicon start and end positions were



Internet address: http://research.nhgri.nih.gov/microarray/downloadable_cdna.html.
Internet address: www.genome.ucsc.edu.
Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location            Start (Mb)   End (Mb)   Size (Mb)
1p13                132.79       132.94     0.2
1q21                173.92       177.25     3.3
1q22                179.28       179.57     0.3
3p14                71.94        74.66      2.7
7p12.1-7p11.2       55.62        60.95      5.3
7q31                125.73       130.96     5.2
7q32                140.01       140.68     0.7
8q21.11-8q21.13     86.45        92.46      6.0
8q21.3              98.45        103.05     4.6
8q23.3-8q24.14      129.88       142.15     12.3
8q24.22             151.21       152.16     1.0
9p13                38.65        39.25      0.6
13q22-q31           77.15        81.38      4.2
16q22               86.70        87.62      0.9
17q11               29.30        30.85      1.6
17q12-q21.2         39.79        42.80      3.0
17q21.32-q21.33     52.47        55.80      3.3
17q22-q23.3         63.81        69.70      5.9
17q23.3-q24.3       69.93        74.99      5.1
19q13               40.63        41.40      0.8
20q11.22            34.59        35.85      1.3
20q13.12            44.00        45.62      1.6
20q13.12-q13.13     46.45        49.43      3.0
20q13.2-q13.32      51.32        59.12      7.8



CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In addition,
novel amplicons were identified at 9p13 (38.65-39.25 Mb)
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct correlation
of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overexpressed.
A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overexpressed
genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel



extended to include neighboring nonamplified clones (ratio, <1.5). The
amplicon size determination was partially dependent on local clone density.
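
The amplicon definition above can be sketched as a scan over clones ordered by genomic position. This is only one reading of the rule: it assumes that "two adjacent clones in two or more cell lines" means the same adjacent pair exceeds the cutoff in at least two lines, that all clones lie on one chromosome, and that missing values are passed as None. The function name is hypothetical.

```python
def amplified_clone_mask(ratios_by_line, cutoff=2.0):
    """Flag clones that fall inside an amplicon.

    ratios_by_line maps a cell line name to a list of CGH ratios, one per
    clone, all in the same genomic order. Returns a list of booleans.
    """
    lines = list(ratios_by_line.values())
    n = len(lines[0])
    high = [[r is not None and r > cutoff for r in line] for line in lines]
    mask = [False] * n

    # Rule 1: at least three adjacent clones above the cutoff in one line.
    for flags in high:
        start = 0
        while start < n:
            if not flags[start]:
                start += 1
                continue
            end = start
            while end < n and flags[end]:
                end += 1
            if end - start >= 3:
                for i in range(start, end):
                    mask[i] = True
            start = end

    # Rule 2: the same adjacent pair above the cutoff in two or more lines.
    for i in range(n - 1):
        support = sum(1 for flags in high if flags[i] and flags[i + 1])
        if support >= 2:
            mask[i] = True
            mask[i + 1] = True
    return mask
```

Extending the flagged runs outward to the nearest nonamplified clones (ratio <1.5), as the text describes for the start and end positions, is left out of this sketch.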

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361K8 was labeled
with SpectrumOrange (Vysis, Downers Grove, IL), and SpectrumOrange-labeled
probe for EGFR was obtained from Vysis. SpectrumGreen-labeled
chromosome 7 and 17 centromere probes (Vysis) were used as a
reference. A tissue microarray containing 612 formalin-fixed, paraffin-embedded
primary breast cancers (17) was applied in FISH analyses as described
(18). The use of these specimens was approved by the Ethics Committee of the
University of Basel and by the NIH. Specimens containing a 2-fold or higher
increase in the number of test probe signals, as compared with corresponding
centromere signals, in at least 10% of the tumor cells were considered to be
amplified. Survival analysis was performed using the Kaplan-Meier method
and the log-rank test.
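
As a small illustration of the scoring criterion above, the following sketch classifies one specimen from per-cell signal counts. The input layout (a list of test/centromere count pairs, one per scored tumor cell) and the function name are assumptions made for the example.

```python
def is_amplified_by_fish(cells, min_fold=2.0, min_fraction=0.10):
    """Apply the 2-fold / 10%-of-cells criterion to one specimen.

    cells is a list of (test_signals, centromere_signals) tuples.
    Cells without any centromere signal are counted as not amplified.
    """
    if not cells:
        raise ValueError("no cells scored")
    amplified_cells = sum(
        1 for test, cen in cells if cen > 0 and test / cen >= min_fold
    )
    return amplified_cells / len(cells) >= min_fraction
```

A specimen with 2 of 20 scored cells showing 8 test signals against 2 centromere signals meets the 10% threshold and would be called amplified.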

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGGTAGCGATTGTAG-3'.

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824
arrayed cDNA clones were applied for analysis of gene expression
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although
less dramatic, outcomes on gene expression (Fig. 1).

Identification of Distinct Breast Cancer Amplicons. Base-pair
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2, Supplement
Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal





Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and chromosome
17 centromere (green) to BT-474 cells (B).

Fig. 4. List of 50 genes with a statistically
significant correlation (α value <0.05) between
gene copy number and gene expression. Name,
chromosomal location, and the α value for each
gene are indicated. The genes have been ordered
according to their position in the genome. The color
maps on the right illustrate the copy number and
expression ratio patterns in the 14 cell lines. The
key to the color code is shown at the bottom of the
graph. Gray squares, missing values. The complete
list of 270 genes is shown in supplemental Fig. B.

amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated
with poor prognosis of the patients (P = 0.001).

Statistical Identification and Characterization of 270 Highly
Expressed Genes in Amplicons. Statistical comparison of expression
levels of all genes as a function of gene amplification identified
270 genes whose expression was significantly influenced by copy
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). According
to the gene ontology data, 91 of the 270 genes represented
hypothetical proteins or genes with no functional annotation, whereas
179 had associated functional information available. Of these, 151
(84%) are implicated in apoptosis, cell proliferation, signal transduction,
and transcription, whereas 28 (16%) had functional annotations
that could not be directly linked with cancer.



DISCUSSION 

The importance of recurrent gene and chromosome copy number
changes in the development and progression of solid tumors has been
characterized in >1000 publications applying CGH (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and
molecular genetic studies. The effects of these somatic genetic
changes on gene expression levels have remained largely unknown,
although a few studies have explored gene expression changes occurring
in specific amplicons (15, 19-21). Here, we applied genome-wide
cDNA microarrays to identify transcripts whose expression
changes were attributable to underlying gene copy number alterations
in breast cancer.

The overall impact of copy number on gene expression patterns was
substantial, with the most dramatic effects seen in the case of high-level
copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the
regions affected, but these effects were more subtle on a gene-by-gene
basis than those of high-level amplifications. However, the impact of
low-level gains on the dysregulation of gene expression patterns in
cancer may be equally important if not more important than that of
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic
alterations in breast and other cancers and, therefore, have an influence
on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression patterns
in yeast cells, acute myeloid leukemia, and a prostate cancer
model system (22-24).

Internet address: http://www.geneontology.org/.
Internet address: http://www.ncbi.nlm.nih.gov/entrez.

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many amplicons
detected previously by chromosomal CGH (9, 10, 25, 26) and
also discovered novel amplicons that had not been detected previously,
presumably because of their small size (only 1-2 Mb) or close
proximity to other larger amplicons. One of these novel amplicons
involved the homeobox gene region at 17q21.3 and led to the overexpression
of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic
development and have been occasionally reported to undergo aberrant
expression in cancer (27, 28). HOXB7 transfection induced cell proliferation
in melanoma, breast, and ovarian cancer cells and increased
tumorigenicity and angiogenesis in breast cancer (29-32). The present
results imply that gene amplification may be a prominent mechanism
for overexpressing HOXB7 in breast cancer and suggest that
HOXB7 contributes to tumor progression and confers an aggressive
disease phenotype in breast cancer. This view is supported by our
finding of amplification of HOXB7 in 10% of 363 primary breast
cancers, as well as an association of amplification with poor prognosis
of the patients.

We carried out a systematic search to identify genes whose
expression levels across all 14 cell lines were attributable to
amplification status. Statistical analysis revealed 270 such genes
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in
breast cancer development and suggest novel pathogenetic mechanisms.
Although we would not expect all of them to be causally
involved, it is intriguing that 84% of the genes with associated
functional information were implicated in apoptosis, cell proliferation,
signal transduction, transcription, or other cellular processes
that could directly imply a possible role in cancer progression.
Therefore, a detailed characterization of these genes may provide
biological insights to breast cancer progression and might lead to
the development of novel therapeutic strategies.

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over
12,000 transcripts throughout the breast cancer genome, roughly
once every 267 kb. This analysis provided: (a) evidence of a
prominent global influence of copy number changes on gene
expression levels; (b) a high-resolution map of 24 independent
amplicons in breast cancer; and (c) identification of a set of 270
genes, the overexpression of which was statistically attributable to
gene amplification. Characterization of a novel amplicon at
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association
between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by
gene amplification provides a powerful approach to highlight
genes with an important role in cancer as well as to prioritize and
validate putative targets for therapy development.



Genome-wide Study of Gene Copy Numbers,
Transcripts, and Protein Levels in Pairs of
Non-invasive and Invasive Human Transitional
Cell Carcinomas*

Torben F. Ørntoft‡§, Thomas Thykjaer¶, Frederic M. Waldman‖, Hans Wolf**,
and Julio E. Celis‡‡



Gain and loss of chromosomal material is characteristic
of bladder cancer, as well as malignant transformation in
general. The consequences of these changes at both the
transcription and translation levels are at present unknown
partly because of technical limitations. Here we have attempted
to address this question in pairs of non-invasive
and invasive human bladder tumors using a combination
of technology that included comparative genomic hybridization,
high density oligonucleotide array-based monitoring
of transcript levels (5600 genes), and high resolution
two-dimensional gel electrophoresis. The results showed
that there is a gene dosage effect that in some cases
superimposes on other regulatory mechanisms. This effect
depended (p < 0.015) on the magnitude of the comparative
genomic hybridization change. In general (18 of
23 cases), chromosomal areas with more than 2-fold gain
of DNA showed a corresponding increase in mRNA transcripts.
Areas with loss of DNA, on the other hand,
showed either reduced or unaltered transcript levels. Because
most proteins resolved by two-dimensional gels
are unknown it was only possible to compare mRNA and
protein alterations in relatively few cases of well focused
abundant proteins. With few exceptions we found a good
correlation (p < 0.005) between transcript alterations and
protein levels. The implications, as well as limitations,
of the approach are discussed. Molecular & Cellular
Proteomics 1:37-45, 2002.

Aneuploidy is a common feature of most human cancers
(1), but little is known about the genome-wide effect of this
phenomenon at both the transcription and translation levels.
High throughput array studies of the breast cancer cell line
BT474 have suggested that there is a correlation between
DNA copy numbers and gene expression in highly amplified
areas (2), and studies of individual genes in solid tumors
have revealed a good correlation between gene dose and
mRNA or protein levels in the case of c-erb-B2, cyclin D1,
ems1, and N-myc (3-5). However, a high cyclin D1 protein
expression has been observed without simultaneous amplification
(4), and a low level of c-myc copy number increase
was observed without concomitant c-myc protein
overexpression (6).

In human bladder tumors, karyotyping, fluorescent in situ
hybridization, and comparative genomic hybridization (CGH)
have revealed chromosomal aberrations that seem to be
characteristic of certain stages of disease progression. In the
case of non-invasive pTa transitional cell carcinomas (TCCs),
this includes loss of chromosome 9 or parts of it, as well as
loss of Y in males. In minimally invasive pT1 TCCs, the following
alterations have been reported: 2q-, 11p-, 1q+,
11q13+, 17q+, and 20q+ (7-12). It has been suggested that
these regions harbor tumor suppressor genes and oncogenes;
however, the large chromosomal areas involved often
contain many genes, making meaningful predictions of the
functional consequences of losses and gains very difficult.

In this investigation we have combined genome-wide technology
for detecting genomic gains and losses (CGH) with
gene expression profiling techniques (microarrays and proteomics)
to determine the effect of gene copy number on
transcript and protein levels in pairs of non-invasive and invasive
human bladder TCCs.

EXPERIMENTAL PROCEDURES 

Material—Bladder tumor biopsies were sampled after informed
consent was obtained and after removal of tissue for routine pathology
examination. By light microscopy tumors 335 and 532 were
staged by an experienced pathologist as pTa (superficial papillary),

The abbreviations used are: CGH, comparative genomic hybridization;
TCC, transitional cell carcinoma; LOH, loss of heterozygosity;
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D,
two-dimensional.

Fig. 1. DNA copy number and mRNA expression level. Shown from left to right are chromosome (Chr.), CGH profiles, gene location and
expression level of specific genes, and overall expression level along the chromosome. A, expression of mRNA in invasive tumor 733 as
compared with the non-invasive counterpart tumor 335. B, expression of mRNA in invasive tumor 827 compared with the non-invasive
counterpart tumor 532. The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome
(left). The bold curve in the ratio profile represents a mean of four chromosomes and is surrounded by thin curves indicating one standard
deviation. The central vertical line (broken) indicates a ratio value of 1 (no change), and the vertical lines next to it (dotted) indicate a ratio of
0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor 335 used for comparison showed alterations in DNA content, the ratio
profile of that chromosome is shown to the right of the invasive tumor profile. The colored bars represent one gene each, identified by the
running numbers above the bars (the name of the gene can be seen at www.MDL.DK/sdata.html). The bars indicate the purported location of
the gene, and the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart: >2-fold
increase (black), >2-fold decrease (blue), no significant change (orange). The bar to the far right, entitled Expression, shows the resulting change
in expression along the chromosome; the colors indicate that at least half of the genes were up-regulated (black), at least half of the genes
down-regulated (blue), or more than half of the genes are unchanged (orange). If a gene was absent in one of the samples and present in
another, it was regarded as more than a 2-fold change. A 2-fold level was chosen as this corresponded to one standard deviation in a double
determination of ~1800 genes. Centromeres and heterochromatic regions were excluded from data analysis.



grade I and II, respectively; tumors 733 and 827 were staged as pT1
(invasive into submucosa), 733 was staged as solid, and 827 was
staged as papillary, both grade III.

mRNA Preparation—Tissue biopsies, obtained fresh from surgery,
were embedded immediately in a sodium-guanidinium thiocyanate
solution and stored at -80 °C. Total RNA was isolated using the
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH).
Poly(A)+ RNA was isolated by an oligo(dT) selection step (Oligotex
mRNA kit; Qiagen).

cRNA Preparation—[…] μg of mRNA was used as starting material.
The first and second strand cDNA synthesis was performed using the
SuperScript® choice system (Invitrogen) according to the manufacturer's
instructions but using an oligo(dT) primer containing a T7 RNA
polymerase binding site. Labeled cRNA was prepared using the
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and
UTP (Enzo) was used, together with unlabeled NTPs in the reaction.
Following the in vitro transcription reaction, the unincorporated nucleotides
were removed using RNeasy columns (Qiagen).

Array Hybridization and Scanning—Array hybridization and scanning
was modified from a previous method (13). 10 μg of cRNA was
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization,
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl,
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min,
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe
array cartridge. The probe array was then incubated for 16 h at 40 °C
at constant rotation (60 rpm). The probe array was exposed to 10
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T
at 50 °C. The biotinylated cRNA was stained with a streptavidin-phycoerythrin
conjugate, 10 μg/ml (Molecular Probes) in 6x SSPE-T
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The
probe arrays were scanned at 560 nm using a confocal laser scanning
microscope (made for Affymetrix by Hewlett-Packard). The readings
from the quantitative scanning were analyzed by Affymetrix gene
expression analysis software.

Microsatellite Analysis—Microsatellite analysis was performed as
described previously (14). Microsatellites were selected by use of
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were obtained
from the genome data base at www.gdb.org. DNA was extracted
from tumor and blood and amplified by PCR in a volume of 20 μl for 35
cycles. The amplicons were denatured and electrophoresed for 3 h in an
ABI Prism 377. Data were collected in the Gene Scan program for
fragment analysis. Loss of heterozygosity was defined as less than 33%
of one allele detected in tumor amplicons compared with blood.

Proteomic Analysis—TCCs were minced into small pieces and
homogenized in a small glass homogenizer in 0.5 ml of lysis solution.
Samples were stored at -20 °C until use. The procedure for 2D gel
electrophoresis has been described in detail elsewhere (15, 16). Gels
were stained with silver nitrate and/or Coomassie Brilliant Blue. Proteins
were identified by a combination of procedures that included
microsequencing, mass spectrometry, two-dimensional gel Western
immunoblotting, and comparison with the master two-dimensional gel
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis.

CGH—Hybridization of differentially labeled tumor and normal DNA
to normal metaphase chromosomes was performed as described
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red-labeled
reference DNA (200 ng), and human Cot-1 DNA (20 μg) were
denatured at 37 °C for 5 min and applied to denatured normal metaphase
slides. Hybridization was at 37 °C for 2 days. After washing,
the slides were counterstained with 0.15 μg/ml 4,6-diamidino-2-phenylindole
in an anti-fade solution. A second hybridization was performed
for all tumor samples using fluorescein-labeled reference DNA
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the
aberrations detected during the initial hybridization. Each CGH experiment
also included a normal control hybridization using fluorescein-
and Texas Red-labeled normal DNA. Digital image analysis was
used to identify chromosomal regions with abnormal fluorescence
ratios, indicating regions of DNA gains and losses. The average
green:red fluorescence intensity ratio profiles were calculated using
four images of each chromosome (eight chromosomes total) with
normalization of the green:red fluorescence intensity ratio for the
entire metaphase and background correction. Chromosome identification
was performed based on 4,6-diamidino-2-phenylindole banding
patterns. Only images showing uniform high intensity fluorescence
with minimal background staining were analyzed. All
centromeres, p arms of acrocentric chromosomes, and heterochromatic
regions were excluded from the analysis.

RESULTS 

Comparative Genomic Hybridization—The CGH analysis
identified a number of chromosomal gains and losses in the

Table I
Correlation between alterations detected by CGH and by expression monitoring

Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as
independent variable (if expression alteration - what CGH deviation was found).

                  Tumor 733 vs. 335                                    Tumor 827 vs. 532
CGH alterations   Expression change   Concordance    CGH alterations   Expression change   Concordance
                  clusters                                             clusters
13 Gain           10 Up-regulation    77%            10 Gain           8 Up-regulation     80%
                  0 Down-regulation                                    0 Down-regulation
                  3 No change                                          2 No change
10 Loss           1 Up-regulation     50%            12 Loss           3 Up-regulation     17%
                  5 Down-regulation                                    2 Down-regulation
                  4 No change                                          7 No change

Tumor 733 vs. 335                                    Tumor 827 vs. 532
Expression change   CGH alterations   Concordance    Expression change   CGH alterations   Concordance
clusters                                             clusters
16 Up-regulation    11 Gain           69%            17 Up-regulation    10 Gain           59%
                    2 Loss                                               5 Loss
                    3 No change                                          2 No change
21 Down-regulation  1 Gain            38%            9 Down-regulation   0 Gain            33%
                    8 Loss                                               3 Loss
                    12 No change                                         6 No change
15 No change        3 Gain            60%            21 No change        1 Gain            81%
                    3 Loss                                               3 Loss
                    9 No change                                          17 No change






two invasive tumors (stage pT1, TCCs 733 and 827), whereas
the two non-invasive papillomas (stage pTa, TCCs 335 and
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-,
and Y-, respectively. Both invasive tumors showed changes
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-,
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typical
for their disease stage, as well as additional alterations,
some of which are shown in Fig. 1. Areas with gains and
losses deviated from the normal copy number to some extent,
and the average numerical deviation from normal was 0.4-fold
in the case of TCC 733 and 0.3-fold for TCC 827. The largest
changes, amounting to at least a doubling of chromosomal
content, were observed at 1q23 in TCC 733 (Fig. 1A) and
20q12 in TCC 827 (Fig. 1B).

mRNA Expression in Relation to DNA Copy Number—The
mRNA levels from the two invasive tumors (TCCs 827 and
733) were compared with the two non-invasive counterparts
(TCCs 532 and 335). This was done in two separate experiments
in which we compared TCCs 733 to 335 and 827 to
532, respectively, using two different scaling settings for the
arrays to rule out scaling as a confounding parameter. Approximately
1,800 genes that yielded a signal on the arrays
were searched in the Unigene and Genemap data bases for
chromosomal location, and those with a known location
(1096) were plotted as bars covering their purported locus. In
that way it was possible to construct a graphic presentation of
DNA copy number and relative mRNA levels along the individual
chromosomes (Fig. 1).

For each mRNA a ratio was calculated between the level in
the invasive versus the non-invasive counterpart. Bars, which
represent chromosomal location of a gene, were color-coded
according to the expression ratio, and only differences larger
than 2-fold were regarded as informative (Fig. 1). The density
of genes along the chromosomes varied, and areas containing
only one gene were excluded from the calculations. The
resolution of the CGH method is very low, and some of the
outlier data may be because of the fact that the boundaries of
the chromosomal aberrations are not known at high resolution.
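
The per-gene and per-region scoring described here (and in the legend to Fig. 1) can be sketched as follows. The function names are hypothetical, an absent gene is represented as None or 0, and the tie-breaking order when half of the genes go up and half go down is an assumption, since the source does not address that case.

```python
def classify_gene(invasive, noninvasive, fold=2.0):
    """Call one gene 'up', 'down', or 'unchanged' from two expression levels.

    A gene absent in one sample and present in the other counts as a
    more than 2-fold change.
    """
    inv_present = invasive is not None and invasive > 0
    non_present = noninvasive is not None and noninvasive > 0
    if not inv_present and not non_present:
        return "unchanged"
    if inv_present and not non_present:
        return "up"
    if non_present and not inv_present:
        return "down"
    ratio = invasive / noninvasive
    if ratio > fold:
        return "up"
    if ratio < 1.0 / fold:
        return "down"
    return "unchanged"


def classify_region(gene_calls):
    """Summarize a chromosomal area from its per-gene calls.

    Areas with only one gene are excluded (None). 'up' is checked before
    'down'; that ordering is an assumption for the tie case.
    """
    n = len(gene_calls)
    if n < 2:
        return None
    if 2 * gene_calls.count("up") >= n:
        return "up"
    if 2 * gene_calls.count("down") >= n:
        return "down"
    return "unchanged"
```

For instance, an area with calls up, up, unchanged, down has exactly half of its genes up-regulated and is scored as up-regulated.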

Two sets of calculations were made from the data. For the
first set we used CGH alterations as the independent variable
and estimated the frequency of expression alterations in these
chromosomal areas. In general, areas with a strong gain of
chromosomal material contained a cluster of genes having
increased mRNA expression. For example, both chromosomes
1q21-q25, 2p and 9q, showed a relative gain of more
than 100% in DNA copy number that was accompanied by
increased mRNA expression levels in the two tumor pairs (Fig.
1). In most cases, chromosomal gains detected by CGH were
accompanied by an increased level of transcripts in both
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal
losses, on the other hand, were not accompanied by decreased
expression in several cases, and were often registered
as having unaltered RNA levels (Table I, top). The inability
to detect RNA expression changes in these cases was not
because of fewer genes mapping to the lost regions (data not
shown).
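
The concordance percentages quoted from Table I follow directly from the cluster counts. A short sketch of the arithmetic, using the counts reported in the top half of the table (the function name and dictionary layout are illustrative only):

```python
def concordance(counts, concordant_key):
    """Percentage of areas whose expression call matches the CGH call."""
    total = sum(counts.values())
    return 100.0 * counts[concordant_key] / total


# Counts from Table I, top (CGH alteration as the independent variable).
gain_733 = {"up": 10, "down": 0, "unchanged": 3}
loss_733 = {"up": 1, "down": 5, "unchanged": 4}
gain_827 = {"up": 8, "down": 0, "unchanged": 2}
loss_827 = {"up": 3, "down": 2, "unchanged": 7}

print(round(concordance(gain_733, "up")))    # gains, TCC 733
print(round(concordance(loss_733, "down")))  # losses, TCC 733
print(round(concordance(gain_827, "up")))    # gains, TCC 827
print(round(concordance(loss_827, "down")))  # losses, TCC 827
```

Rounded to whole percentages these reproduce the 77%, 50%, 80%, and 17% entries in the table.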

In the second set of calculations we selected expression
alterations above 2-fold as the independent variable and
estimated the frequency of CGH alterations in these areas. As
above, we found that increased transcript expression correlated
with gain of chromosomal material (TCC 733, 69% and
TCC 827, 59%), whereas reduced expression was often detected
in areas with unaltered CGH ratios (Table I, bottom).
Furthermore, as a control we looked at areas with no alteration
in expression. No alteration was detected by CGH in
most of these areas (TCC 733, 60% and TCC 827, 81%; see
Table I, bottom). Because the ability to observe reduced or
increased mRNA expression clustering to a certain chromosomal
area clearly reflected the extent of copy number
changes, we plotted the maximum CGH aberrations in the
regions showing CGH changes against the ability to detect a
change in mRNA expression as monitored by the oligonucleotide
arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and
TCC 827 (p < 0.00003) a highly significant correlation was
observed between the level of CGH ratio change (reflecting
the DNA copy number) and alterations detected by the array
based technology (Fig. 2). Similar data were obtained when
areas with altered expression were used as independent variables.
These areas correlated best with CGH when the CGH
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did
not at lower CGH deviations. These data probably reflect that
loss of an allele may only lead to a 50% reduction in expression
level, which is at the cut-off point for detection of expression
alterations. Gain of chromosomal material can occur to a
much larger extent.

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the
ratio value of one were included.
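
The scoring rules stated in the text and in the Fig. 2 legend (a 2-fold cut-off per transcript, at least half of the mRNAs in a region changing, and inclusion of arms whose CGH ratio plus or minus one standard deviation excludes 1) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the cut-off is treated as inclusive, and a tie between up- and down-regulated transcripts is scored as no change.

```python
def score_region(ratios, cutoff=2.0):
    """Score a chromosomal region as "up", "down", or "none".

    ratios: invasive/non-invasive expression ratios, one per mRNA in the region.
    A transcript counts as altered when it changes by at least `cutoff`-fold
    (inclusive boundary is an assumption). The region is scored when at least
    half of its transcripts change in the same direction; ties score "none".
    """
    total = len(ratios)
    if total == 0:
        return "none"
    up = sum(1 for r in ratios if r >= cutoff)
    down = sum(1 for r in ratios if r <= 1.0 / cutoff)
    if 2 * up >= total and up > down:
        return "up"
    if 2 * down >= total and down > up:
        return "down"
    return "none"

def arm_included(cgh_ratio, sd):
    """True when the interval cgh_ratio +/- one SD lies entirely off 1.0."""
    return cgh_ratio - sd > 1.0 or cgh_ratio + sd < 1.0

# Expected ratio if expression simply tracked copy number (two copies normal).
single_copy_loss = 1 / 2   # 0.5, exactly a 2-fold decrease
single_copy_gain = 3 / 2   # 1.5, below the 2-fold cut-off

print(score_region([2.5, 3.0, 1.0]))        # up
print(score_region([0.4, 0.5, 1.1, 0.9]))   # down
print(score_region([1.2, 0.9, 2.1]))        # none
print(arm_included(1.4, 0.2))               # True
```

The two constants make the argument in the text concrete: losing one of two alleles gives an expected ratio of 0.5, which sits exactly on the 2-fold boundary, so small measurement noise decides whether the loss is scored at all.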

Microsatellite-based Detection of Minor Areas of Losses—In
TCC 733, several chromosomal areas exhibiting DNA
amplification were preceded or followed by areas with a normal
CGH but reduced mRNA expression (see Fig. 1, TCC 733
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and
10q22). To determine whether these results were because of
undetected loss of chromosomal material in these regions or
because of other non-structural mechanisms regulating
transcription, we examined two microsatellites positioned at
chromosome 1q25-32 and two at chromosome 2p22. Loss of
heterozygosity (LOH) was found at both 1q25 and at 2p22
indicating that minor deleted areas were not detected with the
resolution of CGH (Fig. 3). Additionally, chromosome 2p in
TCC 733 showed a CGH pattern of gain/no change/gain of
DNA that correlated with transcript increase/decrease/increase.
Thus, for the areas showing increased expression
there was a correlation with the DNA copy number alterations
(Fig. 1A). As indicated above, the mRNA decrease observed in
the middle of the chromosomal gain was because of LOH,
implying that one of the mechanisms for mRNA down-regulation
may be regions that have undergone smaller losses of
chromosomal material. However, this cannot be detected with
the resolution of the CGH method.

In both TCC 733 and TCC 827, the telomeric end of
chromosome 11p showed a normal ratio in the CGH analysis;
however, clusters of five and three genes, respectively, lost
their expression. Two microsatellites (D11S1760, D11S922)
positioned close to MUC2, IGF2, and cathepsin D indicated
LOH as the most likely mechanism behind the loss of
expression (data not shown).

A reduced expression of mRNA observed in TCC 733 at
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2,
and 18q12 was also examined for chromosomal losses using
microsatellites positioned as close as possible to the gene loci

Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor
733 showing loss of heterozygosity at chromosome 1q25, detected
(a) by D1S215 close to Hu class I histocompatibility antigen (gene
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close
to general β-spectrin (gene number 11 on Fig. 1) and of (d) tumor 827
showing loss of heterozygosity at chromosome 18q12 by D18S1118
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number
12 in Fig. 1). The upper curves show the electropherogram obtained
from normal DNA from leukocytes (N), and the lower curves show the
electropherogram from tumor DNA (T). In all cases one allele is
partially lost in the tumor amplicon.

showing reduced mRNA transcripts. Only the microsatellite
positioned at 18q12 showed LOH (Fig. 3), suggesting that
transcriptional down-regulation of genes in the other regions
may be controlled by other mechanisms.

Relation between Changes in mRNA and Protein Levels—
2D-PAGE analysis, in combination with Coomassie Brilliant
Blue and/or silver staining, was carried out on all four tumors
using fresh biopsy material. 40 well resolved abundant known
proteins migrating in areas away from the edges of the pH

Fig. 4. Correlation between protein levels as judged by 2D-PAGE
and transcript ratio. For comparison proteins were divided in
three groups, unaltered in level or up- or down-regulated (horizontal
axis). The mRNA ratio as determined by oligonucleotide arrays was
plotted for each gene (vertical axis). A, mRNAs that were scored as
present in both tumors used for the ratio calculation; A, mRNAs that
were scored as absent in the invasive tumors (along horizontal axis) or
as absent in non-invasive reference (top of figure). Two different
scalings were used to exclude scaling as a confounder, TCCs 827
and 532 (AA) were scaled with background suppression, and TCCs
733 and 335 (90) were scaled without suppression. Both comparisons
showed highly significant (p < 0.005) differences in mRNA ratios
between the groups. Proteins shown were as follows: Group A (from
left), phosphoglucomutase 1, glutathione transferase class μ number
4, fatty acid-binding protein homologue, cytokeratin 15, and
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa
heat shock protein, cytokeratin 13, and calcyclin; C (from left),
α-enolase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from
top), glutathione S-transferase-π and mesothelial keratin K7 (type II);
F (from top and left), adenylyl cyclase-associated protein, E-cadherin,
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV,
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B,
translationally controlled tumor protein, liver glyceraldehyde-3-phosphate
dehydrogenase, keratin 8, aldehyde reductase, and
Na,K-ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin,
70-kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver
glyceraldehyde-3-phosphate dehydrogenase, glutathione
S-transferase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen
calreticulin, thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit,
cytokeratin 20, cytokeratin 17, prohibitin, and fructose
1,6-biphosphatase; J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78,
cyclophilin, and cofilin.

gradient, and having a known chromosomal location, were
selected for analysis in the TCC pair 827/532. Proteins were
identified by a combination of methods (see "Experimental
Procedures"). In general there was a highly significant
correlation (p < 0.005) between mRNA and protein alterations (Fig.
4). Only one gene showed disagreement between transcript
alteration and protein alteration. Except for a group of
cytokeratins encoded by genes on chromosome 17 (Fig. 5) the
analyzed proteins did not belong to a particular family. 26 well
focused proteins whose genes had a known chromosomal
location were detected in TCCs 733 and 335, and of these 19
correlated (p < 0.005) with the mRNA changes detected using
the arrays (Fig. 4). For example, PA-FABP was highly
expressed in the non-invasive TCC 335 but lost in the invasive
counterpart (TCC 733; see Fig. 5). The smaller number of
proteins detected in both 733 and 335 was because of the
smaller size of the biopsies that were available.

Fig. 5. Comparison of protein and transcript levels in invasive
and non-invasive TCCs. The upper part of the figure shows a 2D gel
(left) and the oligonucleotide array (right) of TCC 532. The red
rectangles on the upper gel highlight the areas that are compared below.
Identical areas of 2D gels of TCCs 532 and 827 are shown below.
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC
827 (red annotation). The tile on the array containing probes for
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532
and is compared with TCC 827. The upper row of squares in each tile
corresponds to perfect match probes; the lower row corresponds to
mismatch probes containing a mutation (used for correction for
unspecific binding). Absence of signal is depicted as black, and the
higher the signal the lighter the color. A high transcript level was
detected in TCC 532 (6151 units) whereas a much lower level was
detected in TCC 827 (absence of signals). For cytokeratin 13, a high
transcript level was also present in TCC 532 (15659 units), and a
much lower level was present in TCC 827 (623 units). The 2D gels at
the bottom of the figure (left) show levels of PA-FABP and
adipocyte-FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are
down-regulated in the invasive tumor. To the right we show the array
tiles for the PA-FABP transcript. A medium transcript level was
detected in the case of TCC 335 (1277 units) whereas very low levels
were detected in TCC 733 (166 units). IEF, isoelectric focusing.

11 chromosomal regions where CGH showed aberrations
that corresponded to the changes in transcript levels also
showed corresponding changes in the protein level (Table II).
These regions included genes that encode proteins that are
found to be frequently altered in bladder cancer, namely
cytokeratins 17 and 20, annexins II and IV, and the fatty
acid-binding proteins PA-FABP and FBP1. Four of these
proteins were encoded by genes in chromosome 17q, a
frequently amplified chromosomal area in invasive bladder
cancers.

DISCUSSION 

Most human cancers have abnormal DNA content, having
lost some chromosomal parts and gained others. The present
study provides some evidence as to the effect of these gains
and losses on gene expression in two pairs of non-invasive
and invasive TCCs using high throughput expression arrays
and proteomics, in combination with CGH. In general, the
results showed that there is a clear individual regulation of the
mRNA expression of single genes, which in some cases was
superimposed by a DNA copy number effect. In most cases,
genes located in chromosomal areas with gains exhibited
increased mRNA expression, whereas areas showing
losses showed either no change or a reduced mRNA expression.
The latter might be because losses most
often are restricted to loss of one allele, and the cut-off point
for detection of expression alterations was a 2-fold change,
thus being at the border of detection. In several cases, however,
an increase or decrease in DNA copy number was
associated with de novo occurrence or complete loss of
transcript, respectively. Some of these transcripts could not be
detected in the non-invasive tumor but were present at
relatively high levels in areas with DNA amplifications in the
invasive tumors (e.g. in TCC 733 transcript from cellular ligand of
annexin II gene (chromosome 1q21) from absent to 2670
arbitrary units; in TCC 827 transcript from small proline-rich
protein 1 gene (chromosome 1q12-q21.1) from absent to
1326 arbitrary units). It may be anticipated from these data
that significant clustering of genes with an increased expression
to a certain chromosomal area indicates an increased
likelihood of gain of chromosomal material in this area.

Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                Chromosomal location  Tumor TCC  CGH alteration  Transcript alteration  Protein alteration
Annexin II             1q21                  733        Gain            Abs to Pres(a)         Increase
Annexin IV             2p13                  733        Gain            3.9-Fold up            Increase
Cytokeratin 17         17q12-q21             827        Gain            3.8-Fold up            Increase
Cytokeratin 20         17q21.1               827        Gain            5.6-Fold up            Increase
(PA-)FABP              8q21.2                827        Loss            10-Fold down           Decrease
FBP1                   9q22                  827        Gain            2.3-Fold up            Increase
Plasma gelsolin        9q31                  827        Gain            Abs to Pres            Increase
Heat shock protein 28  15q12-q13             827        Loss            2.5-Fold up            Decrease
Prohibitin             17q21                 827/733    Gain            3.7-/2.5-Fold up(b)    Increase
Prolyl-4-hydroxyl      17q25                 827/733    Gain            5.7-/1.6-Fold up       Increase
hnRNP B1               7p15                  827        Loss            2.5-Fold down          Decrease

(a) Abs, absent; Pres, present.
(b) In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.

Considering the many possible regulatory mechanisms acting
at the level of transcription, it seems striking that the gene
dose effects were so clearly detectable in gained areas. One
hypothetical explanation may lie in the loss of controlled
methylation in tumor cells (17-19). Thus, it may be possible
that in chromosomes with increased DNA copy numbers two
or more alleles could be demethylated simultaneously leading
to a higher transcription level, whereas in chromosomes with
losses the remaining allele could be partly methylated, turning
off the process (20, 21). A recent report has documented a
ploidy regulation of gene expression in yeast, but in this case all
the genes were present in the same ratio (22), a situation that is
not analogous to that of cancer cells, which show marked
chromosomal aberrations, as well as gene dosage effects.

Several CGH studies of bladder cancer have shown that
some chromosomal aberrations are common at certain
stages of disease progression, often occurring in more than 1
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33-, and
9q- and Y-, respectively. Likewise, the two minimal invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These
observations indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and
therefore the findings could be of general importance for
bladder cancer.

Considering that the mapping resolution of CGH is about
20 megabases it is only possible to get a crude picture of
chromosomal instability using this technique. Occasionally,
we observed reduced transcript levels close to or inside
regions with increased copy numbers. Analysis of these regions
by positioning heterozygous microsatellites as close as possible
to the locus showing reduced gene expression revealed
loss of heterozygosity in several cases. It seems likely that
multiple and different events occur along each chromosomal
arm and that the use of cDNA microarrays for analysis of DNA
copy number changes will reach a resolution that can resolve
these changes, as has recently been proposed (2). The outlier
data were not more frequent at the boundaries of the CGH
aberrations. At present we do not know the mechanism behind
chromosomal aneuploidy and cannot predict whether
chromosomal gains will be transcribed to a larger extent than
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often
reduced in tumors. However, the relation between imprinting
and gain of chromosomal material is not known.
We regard it as a strength of this investigation that we were
able to compare invasive tumors to benign tumors rather than
to normal urothelium, as the tumors studied were biologically
very close and probably represent successive steps in
the progression of bladder cancer. Despite the limited amount
of fresh tissue available it was possible to apply three different
state of the art methods. The observed correlation between
DNA copy number and mRNA expression is remarkable when
one considers that different pieces of the tumor biopsies were
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogeneous, a notion recently
supported by CGH and LOH data that showed a remarkable
similarity even between tumors and distant metastasis (10, 23).

In the few cases analyzed, mRNA and protein levels
showed a striking correspondence although in some cases
we found discrepancies that may be attributed to translational
regulation, post-translational processing, protein degradation,
or a combination of these. Some transcripts belong to
undertranslated mRNA pools, which are associated with few
translationally inactive ribosomes; these pools, however,
seem to be rare (24). Protein degradation, for example, may
be very important in the case of polypeptides with a short
half-life (e.g. signaling proteins). A poor correlation between
mRNA and protein levels was found in liver cells as determined
by arrays and 2D-PAGE (25), and a moderate correlation
was recently reported by Ideker et al. (26) in yeast.
Interestingly, our study revealed a much better correlation
between gained chromosomal areas and increased mRNA
levels than between loss of chromosomal areas and reduced
mRNA levels. In general, the level of CGH change determined
the ability to detect a change in transcript. One possible
explanation could be that by losing one allele the change in
mRNA level is not so dramatic as compared with gain of
material, which can be rather unlimited and may lead to a
severalfold increase in gene copy number resulting in a much
higher impact on transcript level. The latter would be much
easier to detect on the expression arrays as the cut-off point
was placed at a 2-fold level so as not to be biased by noise on
the array. Construction of arrays with a better signal to noise
ratio may in the future allow detection of lesser than 2-fold
alterations in transcript levels, a feature that may facilitate the
analysis of the effect of loss of chromosomal areas on
transcript levels.

In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four
of these proteins were encoded by genes located at a
frequently amplified area in chromosome 17q. Whether DNA
copy number is one of the mechanisms behind alteration of
these eleven proteins is at present unknown and will have to
be proved by other methods using a larger number of samples.
One factor making such studies complicated is the large
extent of protein modification that occurs after translation,
requiring immunoidentification and/or mass spectrometry to
correctly identify the proteins in the gels.

In conclusion, the results presented in this study exemplify
the large body of knowledge that may be possible to gather in
the future by combining state of the art techniques that follow
the pathway from DNA to protein (26). Here, we used a
traditional chromosomal CGH method, but in the future high
resolution CGH based on microarrays with many thousand radiation
hybrid-mapped genes will increase the resolution and information
derived from these types of experiments (2). Combined with
expression arrays analyzing transcripts derived from genes with
known locations, and 2D gel analysis to obtain information at
the post-translational level, a clearer and more developed
understanding of the tumor genome will be forthcoming.

Genomic DNA copy number alterations are key genetic events in
the development and progression of human cancers. Here we
report a genome-wide microarray comparative genomic
hybridization (array CGH) analysis of DNA copy number variation in
a series of primary human breast tumors. We have profiled DNA
copy number alteration across 6,691 mapped human genes, in 44
predominantly advanced, primary breast tumors and 10 breast
cancer cell lines. While the overall patterns of DNA amplification
and deletion corroborate previous cytogenetic studies, the
high-resolution (gene-by-gene) mapping of amplicon boundaries and
the quantitative analysis of amplicon shape provide significant
improvement in the localization of candidate oncogenes. Parallel
microarray measurements of mRNA levels reveal the remarkable
degree to which variation in gene copy number contributes to
variation in gene expression in tumor cells. Specifically, we find
that 62% of highly amplified genes show moderately or highly
elevated expression, that DNA copy number influences gene
expression across a wide range of DNA copy number alterations
(deletion, low-, mid- and high-level amplification), that on average,
a 2-fold change in DNA copy number is associated with a
corresponding 1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast
tumors is directly attributable to underlying variation in gene copy
number. These findings provide evidence that widespread DNA
copy number alteration can lead directly to global deregulation of
gene expression, which may contribute to the development or
progression of cancer.

Conventional cytogenetic techniques, including comparative
genomic hybridization (CGH) (1), have led to the identification
of a number of recurrent regions of DNA copy number
alteration in breast cancer cell lines and tumors (2-4). While
some of these regions contain known or candidate oncogenes
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide
map, delineating the boundaries of DNA copy number alterations
in tumors, should facilitate the localization and identification
of oncogenes and tumor suppressor genes in breast
cancer. In this study, we have created such a map, using
array-based CGH (5-7) to profile DNA copy number alteration
in a series of breast cancer cell lines and primary tumors.

An unresolved question is the extent to which the widespread
DNA copy number changes that we and others have identified
in breast tumors alter expression of genes within involved
regions. Because we had measured mRNA levels in parallel in
the same samples (8), using the same DNA microarrays, we had
an opportunity to explore on a genomic scale the relationship
between DNA copy number changes and gene expression. From
this analysis, we have identified a significant impact of
widespread DNA copy number alteration on the transcriptional
programs of breast tumors.

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and 
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA labeling
and hybridizations were performed essentially as described
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691
different mapped human genes (i.e., UniGene clusters). The
"reference" (labeled with Cy3) for each hybridization was normal
female leukocyte DNA from a single donor. The fabrication
of cDNA microarrays and the labeling and hybridization of
mRNA samples have been described (8).

Data Analysis and Map Positions. Hybridized arrays were scanned
on a GenePix scanner (Axon Instruments, Foster City, CA), and
fluorescence ratios (test/reference) calculated using SCANALYZE
software (available at http://rana.lbl.gov). Fluorescence ratios
were normalized for each array by setting the average log
fluorescence ratio for all array elements equal to 0. Measurements
with fluorescence intensities more than 20% above background
were considered reliable. DNA copy number profiles
that deviated significantly from background ratios measured in
normal genomic DNA control hybridizations were interpreted as
evidence of real DNA copy number alteration (see Estimating
Significance of Altered Fluorescence Ratios in the supporting
information). When indicated, DNA copy number profiles are
displayed as a moving average (symmetric 5-nearest neighbors).
Map positions for arrayed human cDNAs were assigned by

Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.



identifying the starting position of the best and longest match of
any DNA sequence represented in the corresponding UniGene
cluster (10) against the "Golden Path" genome assembly
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean
fluorescence ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence
ratios are "mean-centered" (i.e., reported relative to the mean
ratio across the 44 tumor samples). The data set described here
can be accessed in its entirety in the supporting information.
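
The three numerical steps described in this section (per-array normalization of log ratios, the symmetric moving average, and mean-centering of mRNA ratios) can be sketched in a few lines of Python. This is a minimal illustration, not the SCANALYZE implementation. It assumes base-2 logarithms, reads "symmetric 5-nearest neighbors" as a 5-gene window centred on each gene and truncated at the ends, and performs mean-centering on the log scale; all function names are hypothetical.

```python
import math
from statistics import mean

def normalize_log_ratios(ratios):
    """Log2-transform test/reference ratios and centre their mean at 0."""
    logs = [math.log2(r) for r in ratios]
    centre = mean(logs)
    return [value - centre for value in logs]

def moving_average(values, window=5):
    """Symmetric moving average over `window` neighbouring genes in map order.

    Windows are truncated at the chromosome ends (an assumption).
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start, stop = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(mean(values[start:stop]))
    return smoothed

def mean_center(values):
    """Report each sample's value relative to the mean across samples."""
    centre = mean(values)
    return [value - centre for value in values]

normalized = normalize_log_ratios([1.0, 2.0, 4.0])
print(normalized)                                    # [-1.0, 0.0, 1.0]
print(moving_average([0.0, 0.0, 5.0, 0.0, 0.0])[2])  # 1.0
print(mean_center([1.0, 2.0, 3.0]))                  # [-1.0, 0.0, 1.0]
```

The moving-average example shows the trade-off of smoothing: an isolated spike of 5 is reduced to 1 at its own position, which suppresses single-gene noise but also flattens very narrow amplicons.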

Results 

We performed CGH on 44 predominantly locally advanced,
primary breast tumors and 10 breast cancer cell lines, using
cDNA microarrays containing 6,691 different mapped human
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the
improved spatial resolution of array CGH, we ordered (fluorescence
ratios for) the 6,691 cDNAs according to the "Golden
Path" (http://genome.ucsc.edu/) genome assembly of the draft
human genome sequences (11). In so doing, arrayed cDNAs not
only themselves represent genes of potential interest (e.g.,
candidate oncogenes within amplicons), but also provide precise
genetic landmarks for chromosomal regions of amplification and
deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect
single-copy loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which
were slightly underestimated, in agreement with previous
observations (7). Numerous DNA copy number alterations were
evident in both the breast cancer cell lines and primary tumors
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes
were generally lower in the tumor samples. DNA copy number
alterations were found in every cancer cell line and tumor, and
on every human chromosome in at least one sample. Recurrent
regions of DNA copy number gain and loss were readily
identifiable. For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 10%/1S%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published
as supporting information on the PNAS web site). The total
number of genomic alterations (gains and losses) was found to
be significantly higher in breast tumors that were high grade (P =
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
O.06O6) (see Table 4, which is published as supporting information
on the PNAS web site).

Fig. 2. DNA copy number alteration across chromosome 8 by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different numbers
of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.

The improved spatial resolution of our array CGH analysis is
illustrated for chromosome 8, which displayed extensive DNA
copy number alteration in our series. A detailed view of the
variation in the copy number of 241 genes mapping to chromosome
8 revealed multiple regions of recurrent amplification;
each of these potentially harbors a different known or previously
uncharacterized oncogene (Fig. 2a). The complexity of amplicon
structure is most easily appreciated in the breast cancer cell line
SKBR3. Although a conventional CGH analysis of 8q in SKBR3
identified only two distinct regions of amplification (12), we
observed three distinct regions of high-level amplification (labeled
1-3 in Fig. 2b). For each of these regions we can define the



boundaries of the interval recurrently amplified in the tumors we 
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells.

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations 

Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper), and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).



of DNA copy number and mRNA levels for genes on chromo- 
some 17 (Fig. 3). The overall patterns of gene amplification and 
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level amplifi- 
cation and increased gene expression is not restricted to chro- 
mosome 17. Genome-wide, of 117 high-level DNA amplifica- 
tions (fluorescence ratios >4, and representing 91 different 
genes), 62% (representing 54 different genes; see Table 5, which 
is published as supporting information on the PNAS web site)
are found associated with at least moderately elevated mRNA
levels (mean-centered fluorescence ratios >2), and 42% (rep- 
resenting 36 different genes) are found associated with compa- 
rably highly elevated mRNA levels (mean-centered fluorescence 
ratios >4). 
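
The concordance figures above can be recomputed from two gene-by-sample matrices. The following is a minimal sketch in Python with NumPy, not the authors' code: the function name `concordance`, the array layout (rows are genes, columns are samples), and the handling of missing values are assumptions made for illustration. Only the cutoffs (DNA fluorescence ratio >4; mean-centered mRNA ratio >2 and >4) come from the text, and the mRNA ratios are assumed to be mean-centered already.

```python
import numpy as np


def concordance(dna_ratio, mrna_ratio, amp_cutoff=4.0, mrna_cutoffs=(2.0, 4.0)):
    """Summarize how often high-level amplification coincides with elevated mRNA.

    dna_ratio, mrna_ratio: arrays of shape (genes, samples) holding
    linear-scale fluorescence ratios; NaN marks a missing measurement.
    Returns (number of amplifications, number of distinct amplified genes,
    {mRNA cutoff: fraction of amplifications above that cutoff}).
    """
    dna = np.asarray(dna_ratio, dtype=float)
    mrna = np.asarray(mrna_ratio, dtype=float)

    # One "amplification" is one gene in one sample above the DNA cutoff.
    # Assumption: events without an mRNA measurement are left out.
    amplified = (dna > amp_cutoff) & ~np.isnan(mrna)
    n_amp = int(amplified.sum())
    n_genes = int(amplified.any(axis=1).sum())

    fractions = {}
    for cutoff in mrna_cutoffs:
        n_up = int((amplified & (mrna > cutoff)).sum())
        fractions[cutoff] = n_up / n_amp if n_amp else float("nan")
    return n_amp, n_genes, fractions
```

Counting events per gene and per sample, rather than per gene only, mirrors the distinction the text draws between 117 amplifications and the 91 different genes they represent.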

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-,
medium-, and high-level amplification (Fig. 4a). For both the



breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 X 10~*', 1 x 10"*^ 
5 X 10-^ 1 X 10-2; tumors, | x 10"^^ 1 X lO'^i*, 5 x 10-*\ 
1 X 10"^). A linear regression of the average log(DNA copy 
number), for each class, against average log(mRNA level) 
demonstrated that on average, a 2-fold change in DNA copy 
number was accompanied by 1.4- and 1.5-fold changes in mRNA 
level for the breast cancer cell lines and tumors, respectively (Fig. 
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve,
but with the peak markedly shifted in the positive direction from
zero. This shift is statistically significant, as evidenced in a plot
of observed vs. expected correlations (Fig. 4c), and reflects a
pervasive global influence of DNA copy number alterations on 
gene expression. Notably, the highest correlations between DNA 
copy number and mRNA level (the right tail of the distribution 
in Fig. 4b) comprise both amplified and deleted genes (data not 
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37 

Fig. 4. Genome-wide influence of DNA copy number alterations on mRNA levels. (a) For breast cancer cell lines (gray) and tumor samples (black), both
mean-centered mRNA fluorescence ratio (log2 scale) quartiles (box plots indicate 25th, 50th, and 75th percentile) and averages (diamonds; y-value error bars
indicate standard errors of the mean) are plotted for each of five classes of genes, representing DNA deletion (tumor/normal ratio < 0.8), no change (0.8-1.2),
low- (1.2-2), medium- (2-4), and high-level (>4) amplification. P values for pair-wise Student's t tests, comparing averages between adjacent classes (moving
left to right), are 4 x lO"*' 1 x 10-« 5 x lO"*, 1 x ip-^ (cell lines), and 1 x 10-«, 1 x IQ-^^* 5 x ^Q-*\ 1 x 10"* (tumors). (b) Distribution of correlations between
DNA copy number and mRNA levels, for 6,095 different human genes across 37 breast tumor samples. (c) Plot of observed versus expected correlation coefficients.
The expected values were obtained by randomization of the sample labels in the DNA copy number data set. The line of unity is indicated. (d) Percent variance
in gene expression (among tumors) directly explained by variation in gene copy number. Percent variance explained (black line) and fraction of data retained
(gray line) are plotted for different fluorescence intensity/background (a rough surrogate for signal/noise) cutoff values. Fraction of data retained is relative
to the 1.2 intensity/background cutoff. Details of the linear regression model used to estimate the fraction of variation in gene expression attributable to
underlying DNA copy number alteration can be found in the supporting information (see Estimating the Fraction of Variation in Gene Expression Attributable
to Underlying DNA Copy Number Alteration).



tumors that could be attributed to underlying variation in DNA 
copy number. From this analysis, we estimate that, overall, about 
1% of all of the observed variation in mRNA levels can be 
explained directly by variation in copy number of the altered 
genes (Fig. 4d). We can reduce the effects of experimental 
measurement error on this estimate by using only that fraction 
of the data most reliably measured (fluorescence intensity/ 
background >3); using that data, our estimate of the percent 
variation in mRNA levels directly attributed to variation in gene 
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in 
the expression programs of the tumor cells themselves, but also 
by the variable presence of non-tumor cell types within clinical
samples. 
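
The first two analyses described above can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names, the treatment of values falling exactly on a class boundary, and the requirement of complete (no missing values) matrices for the correlation step are all assumptions. The class boundaries are taken from the Fig. 4 legend. The regression model used for the third analysis is described only in the supporting information and is not attempted here.

```python
import numpy as np

# Class boundaries from the Fig. 4 legend (tumor/normal DNA ratio):
# deletion < 0.8, no change 0.8-1.2, low 1.2-2, medium 2-4, high > 4.
CLASS_EDGES = np.array([0.8, 1.2, 2.0, 4.0])


def class_averages(dna_ratio, mrna_log2):
    """Average log2 DNA ratio and average log2 mRNA level per class."""
    dna = np.asarray(dna_ratio, dtype=float).ravel()
    mrna = np.asarray(mrna_log2, dtype=float).ravel()
    ok = ~np.isnan(dna) & ~np.isnan(mrna) & (dna > 0)
    dna, mrna = dna[ok], mrna[ok]
    # Assumption: a value exactly on a boundary goes to the upper class.
    labels = np.digitize(dna, CLASS_EDGES)
    classes = np.unique(labels)
    x = np.array([np.log2(dna[labels == k]).mean() for k in classes])
    y = np.array([mrna[labels == k].mean() for k in classes])
    return x, y


def fold_change_per_doubling(x, y):
    """Fit y = slope * x + intercept on the class averages.

    On a log2-log2 scale a slope b means that a 2-fold change in DNA
    copy number goes with a 2**b-fold change in mRNA level.
    """
    slope = np.polyfit(x, y, 1)[0]
    return 2.0 ** slope


def gene_correlations(dna_log2, mrna_log2):
    """Pearson correlation for each gene (row) across samples (columns)."""
    a = np.asarray(dna_log2, dtype=float)
    b = np.asarray(mrna_log2, dtype=float)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
    return (a * b).sum(axis=1) / denom


def shuffled_correlations(dna_log2, mrna_log2, rng):
    """Expected correlations after randomizing DNA sample labels."""
    dna = np.asarray(dna_log2, dtype=float)
    perm = rng.permutation(dna.shape[1])
    return gene_correlations(dna[:, perm], mrna_log2)
```

Under this reading, the reported 1.4- and 1.5-fold changes in mRNA level per 2-fold change in copy number correspond to fitted slopes of log2(1.4) ≈ 0.49 and log2(1.5) ≈ 0.58. Comparing the sorted output of `gene_correlations` with the sorted output of `shuffled_correlations` gives an observed-versus-expected plot of the kind shown in Fig. 4c.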

Discussion

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct
effect on global gene expression patterns in both breast cancer 



cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer 
exhibit at least 2-fold increased expression. These contrasting 
findings may reflect methodological differences between the 
studies. For example, the study of Platzer et al. (15) may have
systematically under-measured gene expression changes. In this 
regard it is remarkable that only 14 transcripts of many thousand 
residing within unamplified chromosomal regions were found to 
exhibit at least 4-fold altered expression in metastatic colon 
cancer. Additionally, their reliance on lower-resolution chromo- 
somal CGH may have resulted in poorly delimiting the bound- 
aries of high-complexity amplicons, effectively overcalling re- 
gions with amplification. Alternatively, the contrasting findings 
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage
compensation. Third, this finding cautions that elevated expression
of an amplified gene cannot alone be considered strong
independent evidence of a candidate oncogene's role in tumor- 
igenesis. In our study, fully 62% of highly amplified genes 
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of ampli- 
con boundaries and shape [to identify the "driving" gene(s) 
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing 
the genomic distribution of expressed genes, even within existing
microarray gene expression data sets, may permit the inference
of DNA copy number aberration, particularly aneuploidy (where
gene expression can be averaged across large chromosomal
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underly- 
ing variation in DNA copy number. Sixth, this finding supports 
a possible role for widespread DNA copy number alteration in 
tumorigenesis (17, 18), beyond the amplification of specific 
oncogenes and deletion of specific tumor suppressor genes. 
Widespread DNA copy number alteration, and the concomitant 
widespread imbalance in gene expression, might disrupt critical 
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 
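
The fourth implication above, that averaging expression across large chromosomal regions may reveal copy number aberration, can be illustrated with a simple moving average. This is only a sketch of the general idea, assuming one sample's log2 expression ratios have already been ordered by chromosomal position; the function name `smoothed_expression` and the default window size are hypothetical choices, not values from the study.

```python
import numpy as np


def smoothed_expression(log2_expr, window=25):
    """Moving average of log2 expression ratios along a chromosome.

    log2_expr: 1-D sequence for one sample, ordered by map position.
    window: number of neighboring genes averaged (hypothetical default).
    Returns an array of length len(log2_expr) - window + 1.
    """
    values = np.asarray(log2_expr, dtype=float)
    if window < 1 or window > values.size:
        raise ValueError("window must be between 1 and the number of genes")
    kernel = np.ones(window) / window
    # "valid" keeps only windows that lie fully inside the chromosome.
    return np.convolve(values, kernel, mode="valid")
```

A sustained positive or negative run in the smoothed profile would be a candidate region of gain or loss; any threshold for calling such a region would need to be calibrated against measured copy number data.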